Preface



Seven years ago, when we wrote the first edition of this text, our objective was
to present, in an easily accessible manner, logistics and supply chain models,
algorithms and tools. The success of that edition, together with new theory and
algorithms and recent technological changes, has motivated us to revise the book.
In this edition, we have attempted to build on the positive elements of the first
edition, and to include what we have learned in the last few years.

The first edition of the book grew out of a number of distribution and logistics 
graduate courses we have taught over a period of about ten years. In the first 
few years, the emphasis was on very basic models such as the traveling salesman 
problem, and on the seminal papers of Haimovich and Rinnooy Kan (1985), which 
analyzed a simple vehicle routing problem, and Roundy (1985), which introduced 
power-of-two policies and proved that they are effective for the one warehouse 
multi-retailer distribution system. At that time, few results existed for more com- 
plex, realistic distribution problems, stochastic inventory problems or the integra- 
tion of these issues. 

Interest in logistics and supply chain management, both in industry and in 
academia, has grown rapidly over the past several years. A number of forces have 
contributed to this trend. First, it has become clear that many companies have 
reduced manufacturing costs as much as practically possible. Many of these com- 
panies are discovering the magnitude of savings that can be achieved by planning 
and managing their supply chain more effectively. Indeed, a striking example is 
Wal-Mart’s success, which is partly attributed to implementing a new logistics 
strategy called cross-docking. At the same time, information and communication 
systems have been widely implemented, and provide access to comprehensive data 
from all components of the supply chain. 




In particular, the influence of the Internet and E-commerce on the economy
in general, and business practice in particular, has been tremendous. Changes
are happening extremely fast and the scope is breathtaking! For instance, the
Direct-Business-Model employed by industry giants such as Dell Computers and
Amazon.com enables customers to order products over the Internet and thus
allows companies to sell their products without relying on third-party distributors
or conventional stores.

Finally, deregulation of the transportation industry has led to the development 
of a variety of transportation modes and reduced transportation costs, while sig- 
nificantly increasing the complexity of logistics systems. 

These developments have motivated the academic community to aggressively 
pursue answers to supply chain research questions. Indeed, in the last five years 
considerable progress has been made in the analysis and solution of logistics and 
supply chain problems. 

This progress was achieved using a variety of techniques. In some cases, the focus
has been on characterizing the structure of the optimal policy and identifying
algorithms that generate the best possible policies. When this is not possible, the
focus has been on an approach whose purpose is to ascertain characteristics of the
problem or of an algorithm that are independent of the specific problem data. That
is, the approach determines characteristics of the solution or the solution method
that are intrinsic to the problem and not the data. This approach includes the
so-called worst-case and average-case analyses which, as illustrated in the book,
not only help to understand characteristics of the problem or solution methodology,
but also provide specific guarantees of effectiveness. In many cases, the insights
obtained from these analyses can then be used to develop practical and effective
algorithms for specific complex logistics problems. Our objective in writing this
book is to describe these tools and developments.

Of course, the work presented in this book is not necessarily an exhaustive ac- 
count of the current state of the art in logistics and supply chain management. 
The field is too vast to be properly covered here. In addition, the practitioner may 
view some of the models discussed as simplistic and the analysis presented as com- 
plex. Indeed, this is the dilemma one is faced with when analyzing very complex, 
multi-faceted, real-world problems. By focusing on simple yet rich models that 
contain important aspects of the real-world problem, we hope to glean important 
characteristics of the problem that might be overlooked by a more detail-oriented 
approach. 

The book is written for graduate students, researchers and practitioners inter- 
ested in the mathematics of logistics and supply chain management. We assume 
the reader is familiar with the basics of linear programming and probability the- 
ory and, in a number of sections, complexity theory and graph theory, although 
in many cases these can be skipped without loss of continuity. The first edition of 
the book focused on: 

• A thorough treatment of performance analysis techniques including worst- 
case analysis, probabilistic analysis and linear programming based bounds. 

• An in-depth analysis of a variety of vehicle routing models focusing on new
insights obtained in recent years. 

• A detailed, easy-to-follow analysis of complex inventory models. 

• A model that integrates inventory control and transportation policies and 
explains the observed effectiveness of the cross-docking strategy. 

• A description of advanced planning systems for planning and managing
important aspects of the logistics system.

We have made substantial changes to the second edition of this text. As we
continued to teach, consult and research supply chain management issues, we have
placed increasing importance on developing effective models for supply chain
planning, coordination and execution. This is reflected in the second edition; we
have added a number of chapters and changed the material in some of the original
chapters to reflect current logistics and supply chain challenges. For example:

• We added a chapter on Convexity and Supermodularity, two important con- 
cepts in the optimization and economics literature (Chapter 2). 

• We added a chapter on Procurement Contracts (Chapter 10). 

• We added a chapter on Supply Chain Planning models (Chapter 11). 

• We added a chapter on the coordination of inventory replenishment and 
pricing strategies (Chapter 9). 

• We cover risk management models (Chapter 9). 

• We significantly revised the portion of the book that covers classical inven- 
tory models (Chapters 6-8). In particular, we revised the analysis of stochas- 
tic inventory models, both for single facility and multi-echelon supply chains. 

• We added a chapter on Network Planning (Chapter 17) focusing on supply 
chain design, inventory positioning and resource utilization. 

Parts of this book are based on work we have done either together or with others.
Indeed, some of the chapters originated from papers we have published in journals
such as Mathematics of Operations Research, Mathematical Programming, Operations
Research, and IIE Transactions. We rewrote most of these, trying to present the
results in a simple yet general and unified way. However, a number of key results,
proofs and discussions are reprinted without substantial change. Of course, in each
case this was done by providing the appropriate reference and by obtaining
permission of the copyright owner. In the case of Operations Research and
Mathematics of Operations Research, it is the Institute for Operations Research
and the Management Sciences. Chapter 11 borrows extensively from “Supply Chain
Design and Planning - Applications of Optimization Techniques for Strategic and
Tactical Models” written by Ana Muriel and David Simchi-Levi and published
in the Handbooks in Operations Research and Management Science, the volume
on Supply Chain Management, S. Graves and A. G. Kok, eds., North-Holland,
Amsterdam. Similarly, Chapter 17 borrows extensively from Designing and Managing
the Supply Chain, written by David Simchi-Levi, Philip Kaminsky and Edith
Simchi-Levi and published by McGraw-Hill, 2003.







ACKNOWLEDGMENTS 

It is our pleasure to acknowledge all those who helped us with the first and
the second editions of this manuscript. First, we would like to acknowledge the
contribution of our colleague, Dr. Frank Chen of the Chinese University of Hong
Kong. Similarly, we are indebted to our colleague, Professor Rafael Hassin of
Tel-Aviv University, and to a number of referees, in particular Professor James Ward
of Purdue University, for carefully reading the manuscript and providing us with
detailed comments and suggestions. In addition, we would like to thank our former
Ph.D. students, Philip Kaminsky, Ana Muriel, Jennifer Ryan and Victor Martinez
de Albeniz, who read through and commented on various chapters or parts of earlier
drafts. Our joint research and their comments and feedback were invaluable.

We would like to thank Edith Simchi-Levi who is the main force behind the 
development of the network planning systems described in Chapter 17 and who 
carefully edited many parts of the book. 

It is also a pleasure to acknowledge the support provided by the National Science 
Foundation, the Office of Naval Research, the Fund for the City of New York, 
General Motors Corporation, Michelin, SAP and Xerox. It is their support that 
made the development of some of the theory presented in the book possible. 

Of course, we would like to thank our editor Dosanjh Achi of Springer-Verlag
who encouraged us throughout, and helped us complete the project. Also, thanks
to Ehrlich, Jamie and the editorial staff at Springer-Verlag in New York for their
help.



David Simchi-Levi, Cambridge, Massachusetts 
Xin Chen, Urbana-Champaign, Illinois 
Julien Bramel, New York, New York 




Contents

Preface

1 Introduction
1.1 What Is Logistics Management?
1.2 Managing Cost and Uncertainty
1.3 Examples
1.4 Modeling Logistics Problems
1.5 Logistics in Practice
1.6 Evaluation of Solution Techniques
1.7 Additional Topics
1.8 Book Overview

I Performance Analysis Techniques

2 Convexity and Supermodularity
2.1 Introduction
2.2 Convex Analysis
2.2.1 Convex Sets and Convex Functions
2.2.2 Continuity and Differentiability Properties
2.2.3 Characterization of Convex Functions
2.2.4 Convexity and Optimization
2.3 Supermodularity
2.4 Exercises

3 Worst-Case Analysis
3.1 Introduction
3.2 The Bin-Packing Problem
3.2.1 First-Fit and Best-Fit
3.2.2 First-Fit Decreasing and Best-Fit Decreasing
3.3 The Traveling Salesman Problem
3.3.1 A Minimum Spanning Tree Based Heuristic
3.3.2 The Nearest Insertion Heuristic
3.3.3 Christofides’ Heuristic
3.3.4 Local Search Heuristics
3.4 Exercises

4 Average-Case Analysis
4.1 Introduction
4.2 The Bin-Packing Problem
4.3 The Traveling Salesman Problem
4.4 Exercises

5 Mathematical Programming Based Bounds
5.1 Introduction
5.2 An Asymptotically Tight Linear Program
5.3 Lagrangian Relaxation
5.4 Lagrangian Relaxation and the Traveling Salesman Problem
5.4.1 The 1-Tree Lower Bound
5.4.2 The 1-Tree Lower Bound and Lagrangian Relaxation
5.5 The Worst-Case Effectiveness of the 1-tree Lower Bound
5.6 Exercises

II Inventory Models

6 Economic Lot Size Models with Constant Demands
6.1 Introduction
6.1.1 The Economic Lot Size Model
6.1.2 The Finite Horizon Model
6.1.3 Power of Two Policies
6.2 Multi-Item Inventory Models
6.2.1 Introduction
6.2.2 Notation and Assumptions
6.2.3 Worst-Case Analyses
6.3 A Single Warehouse Multi-Retailer Model
6.3.1 Introduction
6.3.2 Notation and Assumptions
6.4 Exercises

7 Economic Lot Size Models with Varying Demands
7.1 The Wagner-Whitin Model
7.2 Models with Capacity Constraints
7.3 Multi-Item Inventory Models
7.4 Exercises

8 Stochastic Inventory Models
8.1 Introduction
8.2 Single Period Models
8.2.1 The Model
8.3 Finite Horizon Models
8.3.1 Model Description
8.3.2 K-Convex Functions
8.3.3 Main Results
8.4 Quasiconvex Loss Functions
8.5 Infinite Horizon Models
8.6 Multi-Echelon Systems
8.7 Exercises

9 Integration of Inventory and Pricing
9.1 Introduction
9.2 Single Period Models
9.3 Finite Horizon Models
9.3.1 Model Description
9.3.2 Symmetric K-Convex Functions
9.3.3 Additive Demand Functions
9.3.4 General Demand Functions
9.3.5 Special Case: Zero Fixed Ordering Cost
9.3.6 Extensions and Challenges
9.4 Risk Averse Inventory Models
9.4.1 Expected utility risk averse models
9.4.2 Exponential utility risk averse models
9.5 Exercises

III Design and Coordination Models

10 Procurement Contracts
10.1 Introduction
10.2 Wholesale Contracts
10.3 Buy Back Contracts
10.4 Revenue Sharing Contracts
10.5 Portfolio Contracts
10.6 Exercises

11 Supply Chain Planning Models
11.1 Introduction
11.2 The Shipper Problem
11.2.1 The Shipper Model
11.2.2 A Set-Partitioning Approach
11.2.3 Structural Properties
11.2.4 Solution Procedure
11.2.5 Computational Results
11.3 Safety Stock Optimization
11.4 Exercises

12 Facility Location Models
12.1 Introduction
12.2 An Algorithm for the p-Median Problem
12.3 An Algorithm for the Single-Source Capacitated Facility Location Problem
12.4 A Distribution System Design Problem
12.5 The Structure of the Asymptotic Optimal Solution
12.6 Exercises

IV Vehicle Routing Models

13 The Capacitated VRP with Equal Demands
13.1 Introduction
13.2 Worst-Case Analysis of Heuristics
13.3 The Asymptotic Optimal Solution Value
13.4 Asymptotically Optimal Heuristics
13.5 Exercises

14 The Capacitated VRP with Unequal Demands
14.1 Introduction
14.2 Heuristics for the CVRP
14.3 Worst-Case Analysis of Heuristics
14.4 The Asymptotic Optimal Solution Value
14.4.1 A Lower Bound
14.4.2 An Upper Bound
14.5 Probabilistic Analysis of Classical Heuristics
14.5.1 A Lower Bound
14.5.2 The UOP(a) Heuristic
14.6 The Uniform Model
14.7 The Location-Based Heuristic
14.8 Rate of Convergence to the Asymptotic Value
14.9 Exercises

15 The VRP with Time Window Constraints
15.1 Introduction
15.2 The Model
15.3 The Asymptotic Optimal Solution Value
15.4 An Asymptotically Optimal Heuristic
15.4.1 The Location-Based Heuristic
15.4.2 A Solution Method for CVLPTW
15.4.3 Implementation
15.4.4 Numerical Study
15.5 Exercises

16 Solving the VRP Using a Column Generation Approach
16.1 Introduction
16.2 Solving a Relaxation of the Set-Partitioning Formulation
16.3 Solving the Set-Partitioning Problem
16.3.1 Identifying Violated Clique Constraints
16.3.2 Identifying Violated Odd Hole Constraints
16.4 The Effectiveness of the Set-Partitioning Formulation
16.4.1 Motivation
16.4.2 Proof of Theorem 8.4.1
16.5 Exercises

V Logistics Algorithms in Practice

17 Network Planning
17.1 Introduction
17.2 Network Design
17.3 Strategic Safety Stock
17.4 Resource Allocation
17.5 Summary
17.6 Exercises

18 A Case Study: School Bus Routing
18.1 Introduction
18.2 The Setting
18.3 Literature Review
18.4 The Problem in New York City
18.5 Distance and Time Estimation
18.6 The Routing Algorithm
18.7 Additional Constraints and Features
18.8 The Interactive Mode
18.9 Data, Implementation and Results

19 References

Index




List of Tables

9.1 Summary of Results and Tools for the Inventory (and Pricing) Problems
9.2 Summary of Results for Risk Neutral and Risk Averse Models
11.1 Test problems generated as in Balakrishnan and Graves
11.2 Computational results for layered networks. Balakrishnan and Graves results (B&G) versus those of our Linear Programming Based Heuristic (LPBH)
11.3 Computational results for general networks. Balakrishnan and Graves results (B&G) versus those of our Linear Programming Based Heuristic (LPBH)
11.4 Linear and set-up costs used for all the test problems
11.5 Inventory costs and different ranges for the different test problems
11.6 Computational results for a single warehouse
17.1 Network Planning Characteristics




List of Figures

1.1 The Logistics Network
2.1 Examples of Convex Sets and Non-Convex Sets
2.2 Illustration of the Definition of Convex Function
2.3 Illustration of the Definition of Subgradient
3.1 An example for the minimum spanning tree based algorithm with n = 18
3.2 An example for the nearest insertion algorithm with n = 8
3.3 The matching and the optimal traveling salesman tour
3.4 An example for Christofides’ algorithm with n = 7
4.1 The two traveling salesman tours constructed by the partitioning algorithm
4.2 Region partitioning example with n = 17, q = 3, h = 2 and t = 1
4.3 The tour generated by the region partitioning algorithm
4.4 The segments S_1, ..., S_k and the corresponding Eulerian graph
6.1 Inventory level as a function of time
6.2 Inventory level as a function of time under policy V
7.1 The plotted points and the function g
8.1 Illustration of the Properties of K-convex Functions
9.1 Illustration of the Properties of a symmetric K-convex Function
10.1 Illustration of the structure of the optimal ordering policy
11.1 Example of expanded network
11.2 Piece-wise linear and concave cost structure
11.3 Illustration of the model
13.1 Every group contains Q customers with interdistance zero
13.2 An optimal traveling salesman tour in Q(t, s)
13.3 Solution obtained by the ITP heuristic
17.1 The APS screen representing data prior to optimization
17.2 The APS screen representing the optimized logistics network
17.3 The APS screen representing data prior to aggregation
17.4 The APS screen representing data after aggregation
17.5 Transportation Rates for Shipping 4000 lb
17.6 How to read the diagrams
17.7 Current Safety Stock Locations
17.8 Optimized Safety Stock Locations
17.9 Optimized Safety Stock with Reduced Lead Time
17.10 Current Supply Chain
17.11 Optimized Supply Chain
17.12 Global vs. local optimization
17.13 The extended Supply Chain: From manufacturing to order fulfillment
17.14 Comparison of manual versus optimized scenarios




1 

Introduction 



1.1 What Is Logistics Management? 

Fierce competition in today’s global markets, the introduction of products with 
short life cycles and the heightened expectation of customers have forced manu- 
facturing enterprises to invest in and focus attention on their logistics systems. 
This, together with continuing advances in communications and transportation 
technologies (e.g., mobile communication, Internet, and overnight delivery), has 
motivated the continuous evolution of the management of logistics systems. 

In these systems, items are produced at one or more factories, shipped to
warehouses for intermediate storage and then shipped to retailers or customers.
Consequently, to reduce cost and improve service levels, logistics strategies must
take into account the interactions of these various levels in this logistics network,
also referred to as the supply chain. This network consists of suppliers,
manufacturing centers, warehouses, distribution centers and retailer outlets, as well
as raw materials, work-in-process inventory and finished products that flow between
the facilities; see Figure 1.1.

The goal of this book is to present the state of the art in the science of logistics
management. But what exactly is logistics management? According to the Council
of Logistics Management, a nonprofit organization of business personnel, it is

the process of planning, implementing, and controlling the efficient, 
effective flow and storage of goods, services, and related information 
from point of origin to point of consumption for the purpose of con- 
forming to customer requirements. 




FIGURE 1.1. The Logistics Network.

This definition leads to several observations. First, logistics management takes
into consideration every facility that has an impact on cost and plays a role in
making the product conform to customer requirements: from supplier and
manufacturing facilities through warehouses and distribution centers to retailers
and stores. Indeed, in some supply chain analyses, it is necessary to account for the
suppliers’ suppliers and the customers’ customers because they have an impact on
supply chain performance. Second, the objective of logistics management is to be
efficient and cost-effective across the entire system; total systemwide costs, from
transportation and distribution to inventories of raw materials, work in process,
and finished goods, are to be minimized. Thus, the emphasis is not on simply
minimizing transportation cost or reducing inventories but, rather, on taking a
systems approach to logistics management. Finally, because logistics management
revolves around planning, implementing and controlling the logistics network, it
encompasses many of the firm’s activities, from the strategic level through the
tactical to the operational level.

Following Hax and Candea’s (1984) treatment of production-inventory systems, 
logistical decisions are typically classified into three levels. 

• The strategic level deals with decisions that have a long-lasting effect on 
the firm. This includes decisions regarding the number, location and capaci- 
ties of warehouses and manufacturing plants, or the flow of material through 
the logistics network. 

• The tactical level typically includes decisions that are updated anywhere
between once every week and once every quarter. This includes purchasing
and production decisions, inventory policies and transportation strategies,
including the frequency with which customers are visited.

• The operational level refers to day-to-day decisions such as scheduling, 
routing and loading trucks. 

Finally, what about supply chain management? What is the difference between 
supply chain management and logistics management? While the answer to this 
question depends on who is addressing this issue, we will not distinguish between 
logistics and supply chain management in this text. 



1.2 Managing Cost and Uncertainty 

What makes logistics, or supply chain management, difficult? Although we will
discuss a variety of challenges throughout this text, they can all be related to one 
or both of the following observations: 

1. It is challenging to design and operate a logistics system so that systemwide 
costs are minimized, and systemwide service levels are maintained. Indeed, it 
is frequently difficult to operate a single facility so that costs are minimized 
and service level is maintained. The difficulty increases significantly when an
entire system is being considered. 

2. Uncertainty is inherent in every logistics network; customer demand can 
never be forecast exactly, travel times will never be certain, and machines 
and vehicles will break down. Logistics networks need to be designed to 
eliminate as much uncertainty as possible and to deal effectively with the 
uncertainty that remains. 

One reason it is difficult to manage cost and uncertainty is supply chain
dynamics. Indeed, in recent years many suppliers and retailers have observed that
while customer demand for specific products does not vary much, inventory and
back-order levels fluctuate considerably across their supply chain. For instance,
in examining the demand for Pampers disposable diapers, executives at Procter &
Gamble noticed an interesting phenomenon.

As expected, retail sales of the product were fairly uniform; there was no particular
day or month in which demand was significantly higher or lower than any other.
However, the executives noticed that distributors’ orders placed to the factory
fluctuated much more than retail sales. In addition, P&G’s orders to its suppliers
fluctuated even more. This increase in variability as we travel up the supply
chain is referred to as the bullwhip effect.

Even when demand is known precisely (e.g., because of contractual agreements), 
the planning process needs to account for demand and cost parameters varying 
over time due to the impact of seasonal fluctuations, trends, advertising and pro- 
motions, competitors’ pricing strategies, and so forth. These time-varying demand 
and cost parameters make it difficult to determine the most effective supply chain 
strategy, that is, the one that minimizes systemwide costs and conforms to cus- 
tomer requirements. 



1.3 Examples 

In this section we introduce some of the logistics management issues that form 
the basis of the problems studied in the first four parts of the book. These issues 
span a large spectrum of logistics management decisions, at each of the three levels 
mentioned above. Our objective here is to briefly introduce the questions and the 
tradeoffs associated with these decisions. 

Network Configuration 

Consider the situation where several plants are producing products to serve a set
of geographically dispersed retailers. The current set of facilities, i.e., plants and
warehouses, is deemed to be inappropriate, and management wants to reorganize
or redesign the distribution network. This may be due, for example, to changing
demand patterns or the termination of a leasing contract for a number of existing
warehouses. In addition, changing demand patterns may entail a change in plant
production levels, a selection of new suppliers and, in general, a new flow
pattern of goods throughout the distribution network. The goal is to choose a set of
facility locations and capacities, to determine production levels for each product
at each plant, and to set transportation flows between facilities, either from plant
to warehouse or warehouse to retailer, in such a way that total production, inventory
and transportation costs are minimized and various service level requirements are
satisfied.

Production Planning 

A manufacturing facility must produce to meet demand for a product over a fixed 
finite horizon. In many real-world cases it is appropriate to assume that demand is 
known over the horizon. This is possible, for example, if orders have been placed in 
advance or contracts have been signed specifying deliveries for the next few months. 
Production costs consist of a fixed amount, corresponding, say, to machine set-up
costs or times, and a variable amount, corresponding to the time it takes to produce 
one unit. A holding cost is incurred for each unit in inventory. The planner’s 
objective is to satisfy demand for the product in each period and to minimize 
the total production and inventory costs over the fixed horizon. Obviously, this 
problem becomes more difficult as the number of products manufactured increases. 
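
As an illustration only, the single-product version of this problem can be sketched as a small dynamic program. The sketch assumes unlimited production capacity, strictly positive demand in every period, and a holding cost charged on each unit for every period it waits in stock; under these assumptions it is enough to consider plans in which each production run covers a block of consecutive periods. The function name `plan_production` and its parameters are hypothetical and are not notation from this text.

```python
from typing import List, Tuple


def plan_production(demand: List[float], setup_cost: float,
                    unit_cost: float, holding_cost: float) -> Tuple[float, List[float]]:
    """Return (minimum total cost, production quantity per period).

    best[t] is the cheapest cost of meeting demand in periods 0..t-1 when
    nothing is left in stock at the end of period t-1.
    """
    n = len(demand)
    best = [0.0] + [float("inf")] * n
    last_run = [0] * (n + 1)  # period of the final production run behind best[t]

    for t in range(1, n + 1):
        for j in range(t):  # the final run is in period j and covers j..t-1
            quantity = sum(demand[j:t])
            # A unit needed in period k waits k - j periods in inventory.
            holding = sum(holding_cost * (k - j) * demand[k] for k in range(j, t))
            cost = best[j] + setup_cost + unit_cost * quantity + holding
            if cost < best[t]:
                best[t], last_run[t] = cost, j

    # Walk backwards through the recorded runs to recover the plan.
    plan = [0.0] * n
    t = n
    while t > 0:
        j = last_run[t]
        plan[j] = sum(demand[j:t])
        t = j
    return best[n], plan
```

For example, with demands 20, 50, 10, 50, 50, a set-up cost of 100, no variable cost and a holding cost of 1 per unit per period, the sketch produces 80 units in the first period and 100 units in the fourth, for a total cost of 320.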

Inventory Control and Pricing Optimization 

Consider a retailer that maintains an inventory of a particular product. Since 
customer demand is random, the retailer only has information regarding the prob- 
abilistic distribution of demand. The retailer’s objective is to decide at what point 
to reorder a new batch of products, and how much to order. Typically, ordering 
costs consist of two parts: a fixed amount, independent of the size of the order, for 
example, the cost of sending a vehicle from the warehouse to the retailer, and a 
variable amount dependent on the number of products ordered. A linear inventory 
holding cost is incurred at a constant rate per unit of product per unit time. The 
retailer must determine an optimal inventory policy to minimize the expected cost 
of ordering and holding inventory. In some situations the price at which the
product is sold to the end customer is also a decision variable. In this case, demand
is not only random but is also affected by the selling price. The retailer’s objective
is thus to find an inventory policy and a pricing strategy maximizing expected profit
over a finite or infinite horizon.
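
To make the cost structure concrete, the following sketch evaluates one candidate policy on one given sequence of demands. It is an illustration under simplifying assumptions that are ours, not the text's: orders arrive instantly, unmet demand is lost, holding cost is charged on end-of-period stock, and the policy orders up to a fixed level whenever stock is at or below a reorder point. The name `simulate_policy` and all parameter names are hypothetical.

```python
from typing import Sequence


def simulate_policy(demands: Sequence[float], reorder_point: float,
                    order_up_to: float, fixed_cost: float, unit_cost: float,
                    holding_cost: float, start_inventory: float = 0) -> float:
    """Total ordering plus holding cost of a reorder-point policy on one demand path."""
    inventory = start_inventory
    total = 0.0
    for demand in demands:
        if inventory <= reorder_point:
            quantity = order_up_to - inventory
            # Fixed charge per order plus a charge per unit ordered.
            total += fixed_cost + unit_cost * quantity
            inventory = order_up_to
        inventory = max(inventory - demand, 0)  # unmet demand is lost
        total += holding_cost * inventory       # charged on end-of-period stock
    return total
```

Averaging this cost over many sampled demand paths would give an estimate of the expected cost of the policy, which is the quantity the retailer wants to minimize.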

Procurement Strategies and Supply Contracts 

In traditional logistics strategies, each party in the network focuses on its own profit 
and hence makes decisions with little regard to their impact on other partners. 
Relationships between suppliers and buyers are established by means of supply 
contracts that specify pricing and volume discounts, delivery lead times, quality, 
returns, and so forth. The question, of course, is whether supply contracts can
also be used to replace the traditional strategy with one that optimizes the
performance of the entire network. In particular, what is the impact of volume discount
and revenue sharing contracts on supply chain performance? Are there pricing
strategies that can be applied by suppliers to incentivize buyers to order more
products while at the same time increasing the supplier’s profit? What are the risks
associated with supply contracts and how can these risks be minimized?

Integration of Production, Inventory and Transportation Decisions 

Consider the problem faced by companies that rely on LTL (Less than TruckLoad) 
carriers for the distribution of products across their supply chain. Typically, these 
carriers offer volume discounts to encourage larger shipments and, as a result, the 
transportation charges borne by the shipper are often piecewise linear and concave. 
In this case, the timing and routing of shipments need to be coordinated so as to 
minimize system-wide costs, including production, inventory, transportation and 
shortage costs, by taking advantage of economies of scale offered by the carriers. 
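
One simple way to picture a piecewise linear and concave transportation charge is as the cheapest of several "fixed charge plus rate" tariffs. The sketch below is only an illustration: the function name `shipping_charge` and the tariff numbers in `TIERS` are invented for the example and do not describe any real carrier.

```python
from typing import Sequence, Tuple

# Hypothetical tariffs as (fixed charge, rate per unit of weight).
TIERS = [(0, 10), (400, 6), (1600, 3)]


def shipping_charge(weight: float, tiers: Sequence[Tuple[float, float]] = TIERS) -> float:
    """Charge for one shipment: the cheapest tariff applies.

    The minimum of linear functions is piecewise linear and concave, so the
    cost per unit falls as the shipment grows.
    """
    if weight <= 0:
        return 0.0
    return min(fixed + rate * weight for fixed, rate in tiers)
```

With these invented numbers, two separate shipments of 200 units cost 1600 each, while one consolidated shipment of 400 units costs 2800. This gap is the economy of scale that coordinated timing and routing of shipments tries to exploit.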

Vehicle Fleet Management 

A warehouse supplies products to a set of retailers using a fleet of vehicles of limited 
capacity. A dispatcher is in charge of assigning loads to vehicles and determining 
vehicle routes. First, the dispatcher must decide how to partition the retailers into 
groups that can be feasibly served by a vehicle, that is, whose loads fit in a vehicle. 
Second, the dispatcher must decide in what sequence each vehicle visits its retailers so as to minimize cost.
Typically, one of two cost functions is possible: in the first the objective is to 
minimize the number of vehicles used, while in the second the focus is on reducing 
the total distance traveled. The latter is an example of a single-depot Capacitated 
Vehicle Routing Problem (CVRP), where a set of customers has to be served by 
a fleet of vehicles of limited capacity. The vehicles are initially located at a depot 
(in this case, the warehouse) and the objective is to find a set of vehicle routes of 
minimal total length. 

Truck Routing 

Consider a truck that leaves a warehouse to deliver products to a set of retailers. 
The order in which the retailers are visited will determine how long the delivery 
will take and at what time the vehicle can return to the warehouse. Therefore, it 
is important that the vehicle follow an efficient route. The problem of finding the 
minimal length route, in either time or distance, from a warehouse through a set 
of retailers is an example of a Traveling Salesman Problem (TSP). Clearly, truck 
routing is a subproblem of the fleet management example above. 
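
To make the truck routing problem concrete, the following minimal sketch builds a route with the nearest-neighbor rule: from the current location, always drive to the closest retailer not yet visited, then return to the warehouse. This rule is only an illustrative heuristic chosen here, not a method proposed in the text, and it carries no optimality guarantee. The coordinates and the function names `nearest_neighbor_route` and `route_length` are hypothetical.

```python
import math

def nearest_neighbor_route(depot, stops):
    """Visit the closest unvisited stop next, then return to the depot."""
    route, remaining, current = [depot], list(stops), depot
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    route.append(depot)  # close the tour at the warehouse
    return route

def route_length(route):
    """Total Euclidean length of a route given as a list of points."""
    return sum(math.dist(a, b) for a, b in zip(route, route[1:]))

# Hypothetical instance: a warehouse at the origin and three retailers.
warehouse = (0, 0)
retailers = [(4, 0), (4, 3), (0, 3)]
route = nearest_neighbor_route(warehouse, retailers)
length = route_length(route)
```

On this small instance the retailers sit on the corners of a rectangle, so the route simply walks around its perimeter. On larger instances the nearest-neighbor rule can produce routes that are noticeably longer than the optimal one.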

Packing Problems 

In many logistics applications, a collection of items must be packed into boxes, bins 
or vehicles of limited size. The objective is to pack the items such that the number 
of bins used is as small as possible. This problem is referred to as the Bin-Packing 
Problem (BPP). For example, it appears as a special case of the CVRP when 
the objective is to minimize the number of vehicles used to deliver the products. 
Bin-packing also appears in many other applications, including cutting standard 
length wire or paper strips into specific customer order sizes. It also often appears 
as a subproblem in other combinatorial problems. 
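
As an illustration of the Bin-Packing Problem, the sketch below packs a list of item sizes with the first-fit decreasing rule: sort the items from largest to smallest and place each one in the first bin that still has room, opening a new bin when none does. This heuristic is a standard one chosen here for illustration; the text above does not prescribe it. The item sizes, the capacity and the function name `first_fit_decreasing` are hypothetical.

```python
def first_fit_decreasing(sizes, capacity):
    """Pack items into bins of the given capacity using first-fit decreasing."""
    bins = []  # each bin is a list of item sizes
    for size in sorted(sizes, reverse=True):
        if size > capacity:
            raise ValueError("item larger than bin capacity")
        for b in bins:
            if sum(b) + size <= capacity:
                b.append(size)  # first bin with enough room
                break
        else:
            bins.append([size])  # no bin fits: open a new one
    return bins

# Hypothetical loads to be delivered by vehicles of capacity 10.
loads = [4, 8, 1, 4, 2, 1]
bins = first_fit_decreasing(loads, capacity=10)
```

Here the total load is 20, so at least two vehicles are needed, and the heuristic uses exactly two. In general, first-fit decreasing is not guaranteed to find the minimum number of bins.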



1.4 Modeling Logistics Problems 

The reader observes that most of the problems and issues described in the previous 
section are fairly well defined mathematically. These are the type of issues, ques- 
tions and problems addressed in this book. Of course, many issues important to 
logistics are difficult to quantify and therefore to address mathematically; we will 
not cover these in this book. This includes topics related to information systems, 
outsourcing, third party logistics, strategic partnering, etc. For a detailed analysis 
of these topics we refer the reader to the book by Simchi-Levi et al. (2003). 

The fact that the examples provided in the previous section can be defined math- 
ematically is, obviously, meaningless unless all required data are available. As we 
discuss in Part V of this book, finding, verifying and tabulating the data are typ- 
ically very problematic. Indeed, inventory holding costs, production costs, extra 
vehicle costs and warehouse capacities are often difficult to determine. Further- 
more, identifying the data relevant to a particular logistics problem adds another 
layer of complexity to the data gathering problem. Even when the data do exist, 
there are other difficulties related to modeling complex real-world problems. For 
example, in our analysis we ignore issues such as variations in travel times, variable 
yield in production, inventory shrinkage, forecasting, crew scheduling, etc. These 
issues complicate logistics practice considerably. 

For most of this book, we assume that all relevant data, for example, production 
costs, production times, warehouse fixed costs, travel times, holding costs, etc., are 
given. As a result, each logistics problem analyzed in Parts I-IV is well defined 
and thus merely a mathematical problem. 



1.5 Logistics in Practice 

How are logistics problems addressed in practice? That is, how are these difficult 
problems solved in the real world? In our experience, companies use several approaches. 
First and foremost, as in other aspects of life, people tend to repeat 
what has worked in the past. That is, if last year’s safety stock level was enough 
to avoid backlogging demands, then the same level might be used this year. If last 
year’s delivery routes were successful, that is, all retailers received their deliveries 
on time, then why change them? Second, there are so-called “rules of thumb” which 
are widely used and, at least on the surface, may be quite effective. For example, 
it is our experience that many logistics managers often use the so-called “20/80 
rule” which says that about 20% of the products contribute to about 80% of total 
cost and therefore it is sufficient to concentrate efforts on these critical products. 
Logistics network design, to give another example, is an area where a variety of 
rules of thumb are used. One such rule might suggest that if your company serves 
the continental U.S. and it needs only one warehouse, then this warehouse should 
probably be located in the Chicago area; if two are required, then one in Los Angeles 
and one in Atlanta should suffice. Finally, some companies try to apply the 
experience and intuition of logistics experts and consultants, the idea being that 
what has worked well for a competitor should work well for them. 

Of course, while all these approaches are appealing and quite often result in 
logistics strategies that make sense, it is not clear how much is lost by not focusing 
on the best (or close to the best) strategy for the particular case at hand. Indeed, 
recently, with the advent of cheap computing power, it has become increasingly 
affordable for many firms, not just large ones, to acquire and use sophisticated 
Advanced Planning Systems (APS) to optimize their logistics strategies. In these 
systems, data are entered, reviewed and validated, various algorithms are executed 
and a suggested solution is presented in a user-friendly way. Provided the data 
are correct and the system is solving the appropriate problem, these APS can 
substantially reduce system-wide cost. Also, a satisfactory solution is typically 
arrived at only after an iterative process in which the user evaluates 
various scenarios and assesses their impact on costs and service levels. Although 
this may not exactly be considered “optimization” in a strict sense, it usually 
serves as a useful tool for the user of the system. 

These systems have as their nucleus models and algorithms in some form or 
another. In some cases, the system may simply be a computerized version of the 
rules of thumb above. In more and more instances, however, these systems ap- 
ply techniques that have been developed in the operations research, management 
science and computer science research communities. 

In this book, we present the current state of the art in mathematical research in 
the area of logistics. Some of the problems listed above represent difficult stochastic 
optimization problems whose analysis requires concepts such as convexity and 
supermodularity, and their extensions. Other problems have at their core 
extremely difficult combinatorial problems in the class called NP-Hard problems. 
This implies that it is very unlikely that one can construct an algorithm that will 
always find the optimal solution, or the best possible decision, in computation time 
that is polynomial in the “size” of the problem. The interested reader can refer 
to the excellent book by Garey and Johnson (1979) for details on computational 
complexity. Therefore, in many cases an algorithm that consistently provides the 
optimal solution is not considered a reachable goal, and hence heuristic, or ap- 
proximation, methods are employed. 



1.6 Evaluation of Solution Techniques 

A fundamental research question is how to evaluate heuristic or approximation 
methods. Such methods can range from simple “rules of thumb” to complex, com- 
putationally intensive, mathematical programming techniques. In general, these 
are methods that will find good solutions to the problem in a reasonable amount 
of time. Of course, the terms “good” and “reasonable” depend on the heuristic and 
on the problem instance. Also, what constitutes reasonable time may be highly de- 
pendent on the environment in which the heuristic will be used; that is, it depends 
on whether the algorithm needs to solve the logistics problem in real-time. 

Assessing and quantifying a heuristic’s effectiveness is of prime concern. Tradi- 
tionally, the following methods have been employed. 

• Empirical Comparisons: Here, a representative sample of problems is cho- 
sen and the performance of a variety of heuristics is compared. The compar- 
ison can be based on solution quality or computation time, or a combination 
of the two. This approach has one obvious drawback: deciding on a good set 
of test problems. The difficulty, of course, is that a heuristic may perform 
well on one set of problems but may perform poorly on the next. As pointed 
out by Fisher (1995), this lack of robustness forces practitioners to “patch 
up” the heuristic to fix the troublesome cases, leading to an algorithm with 
growing complexity. After considerable effort, a procedure may be created 
that works well for the situation at hand. Unfortunately, the resulting algo- 
rithm is usually extremely sensitive to changes in the data, and may perform 
poorly when transported to other environments. 

• Worst-Case Analysis: In this type of analysis, one tries to determine the 
maximum deviation from optimality, in terms of relative error, that a heuris- 
tic can incur on any problem instance. For example, a heuristic for the BPP 
might guarantee that any solution constructed by the heuristic uses at most 
50% more bins than the optimal solution. Or, a heuristic for the TSP might 
guarantee that the length of the route provided by the heuristic is at most 
twice the length of the optimal route. Using a heuristic with such a guar- 
antee allays some of the fears of suboptimality, by guaranteeing that we 
are within a certain percentage of optimality. Of course, one of the main 
drawbacks of this approach is that a heuristic may perform very well on 
most instances that are likely to appear in a real-world application, but may 
perform extremely poorly on some highly contrived instances. Hence, when 
comparing algorithms it is not clear that a heuristic with a better worst-case 
performance guarantee is necessarily more effective in practice. 

• Average-Case Analysis: Here, the purpose is to determine a heuristic’s 
average performance. This is stated as the average relative error between the 
heuristic solution and the optimal solution under specific assumptions on the 
distribution of the problem data. This may include probabilistic assumptions 
on the depot location, demand size, item size, time windows, vehicle capacities, 
etc. As we shall see, while these probabilistic assumptions may be quite 
general, this approach also has its drawbacks. The most important is 
that an average-case analysis is usually only possible for large size 
problems. For example, in the BPP, if the item sizes are uniformly distributed 
(between zero and the bin capacity), then a heuristic that will be “close to 
optimal” is one that first sorts the items in nonincreasing order and then, 
starting with the largest item, pairs each item with the largest item with 
which it fits. In what sense is it close to optimal? The analysis shows that 
as the problem size increases (the number of items increases), the relative 
error between the solution created by the heuristic and the optimal solution 
decreases to zero. Another drawback is that in order for an average-case 
analysis to be tractable it is sometimes necessary to assume independent 
customer behavior. Finally, determining what probabilistic assumptions are 
appropriate in a particular real-world environment is not a trivial problem. 
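
The pairing heuristic described in the average-case discussion above can be sketched as follows. This is one possible reading of that description, not the authors' exact procedure: items are sorted in nonincreasing order and each unpacked item, starting with the largest, is paired with the largest remaining item with which it fits; if no such item exists, it gets a bin of its own. The function name `pair_largest_fit`, the unit bin capacity, the sample sizes and the random seed are assumptions made for illustration. The lower bound used for comparison is the total item size rounded up, which no packing can beat.

```python
import math
import random

def pair_largest_fit(sizes, capacity=1.0):
    """Put at most two items per bin, pairing each item with the largest that fits."""
    items = sorted(sizes, reverse=True)  # nonincreasing order
    used = [False] * len(items)
    bins = []
    for i, big in enumerate(items):
        if used[i]:
            continue
        used[i] = True
        # Items are sorted, so the first unused item that fits is the largest one.
        partner = next(
            (j for j in range(i + 1, len(items))
             if not used[j] and big + items[j] <= capacity),
            None,
        )
        if partner is None:
            bins.append([big])
        else:
            used[partner] = True
            bins.append([big, items[partner]])
    return bins

rng = random.Random(0)  # fixed seed, an arbitrary choice
for n in (100, 1000):
    sizes = [rng.random() for _ in range(n)]  # uniform on [0, 1)
    packed = pair_largest_fit(sizes)
    lower_bound = math.ceil(sum(sizes))  # no packing can use fewer bins
    print(n, len(packed), lower_bound, len(packed) / lower_bound - 1)
```

The last printed column is the relative gap between the heuristic and the lower bound. The average-case result quoted above concerns the limit as the number of items grows, so a single random draw such as this one only gives an informal impression of that behavior.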

Because of the advantages and potential drawbacks of each of the approaches, 
we agree with Fisher (1980) that these should be treated as complementary ap- 
proaches rather than competing ones. Indeed, it is our experience that the logistics 
algorithms that are most successfully applied in practice are those with good per- 
formance in at least two of the above measures. 

We should also point out that characterizing the worst-case or average-case 
performance of a heuristic may be technically very difficult. Therefore, a heuristic 
may perform very well on average, or in the worst-case, but proving this fact may 
be beyond our current abilities. 



1.7 Additional Topics 

We emphasize that due to space and time considerations we have been obliged 
to omit some important and interesting results. These include results regard- 
ing yield management, machine scheduling, random yield in production, dynamic 
and stochastic fleet management models, etc. We refer the reader to Graves et 
al. (1993), Ball et al. (1995), and De Kok and Graves (2003) for excellent surveys 
of these and other related topics. 

Also, while there exist many elegant and strong results concerning approaches 
to certain logistics problems, there are still many areas where little, if anything, is 
known. This is, of course, partly due to the fact that as the models become more 
complex and integrate more and more issues that arise in practice, their analysis 
becomes more difficult. 

Finally, we remark that it is our firmly held belief that logistics management is 
one of the areas in which a rigorous mathematical analysis yields not only elegant 
results but, even more importantly, has had and will continue to have, a significant 
impact on the practice of logistics. 



1.8 Book Overview 

This book is meant as a survey of a variety of results covering most of the logistics 
area. The reader should have a basic understanding of complexity theory, linear 
programming, probability theory and graph theory. Of course, the book can be 
read easily without delving into the details of each proof. 

The book is organized as follows. In Part I, we concentrate on performance 
analysis techniques. Specifically, in Chapter 2 we introduce the concepts, and as- 
sociated properties, of convexity and supermodularity. In Chapter 3 we discuss 
some of the basic tools required to perform worst-case analysis, while in Chapter 
4 we cover average-case analysis. Finally, in Chapter 5 we investigate the perfor- 
mance of mathematical programming based approaches. 

Part II concentrates on production and inventory problems. We start with lot siz- 
ing in two different deterministic environments, one with constant demand (Chap- 
ter 6) and the second with varying demand (Chapter 7). Chapter 8 focuses on 
stochastic inventory models while Chapter 9 presents new results for the coor- 
dination of inventory and pricing decisions. The chapter distinguishes between 
models appropriate for risk neutral and risk averse decision makers. 

Part III deals with supply chain design and coordination models. These include 
Chapter 10 that focuses on effective supply contracts, such as Buy Back, Revenue 
Sharing and Portfolio contracts. Chapter 11 deals with models that integrate pro- 
duction, inventory and transportation decisions across the supply chain. Finally, 
Chapter 12 analyzes distribution network configuration and facility location, also 
referred to as site selection, problems. 

In Part IV, we consider Vehicle Routing Problems, paying particular attention to 
heuristics with good worst-case or average-case performance. Chapter 13 contains 
an analysis of the single-depot Capacitated Vehicle Routing Problem when all 
customers have equal demands, while Chapter 14 analyzes the case of customers 
with unequal demands. In Chapter 15 we perform an average-case analysis of the 
Vehicle Routing Problem with Time Window constraints. We also investigate set- 
partitioning based approaches and column generation techniques in Chapter 16. 

In Part V, we look at the practice of logistics management and in particular 
at issues related to the design, development and implementation of APS. Specifically, 
in Chapter 17 we look at network planning issues, from logistics network 
design, through inventory positioning, all the way to resource allocation. Finally, 
in Chapter 18 we report on the development of a decision support tool for school 
bus routing and scheduling in the City of New York. 



Convexity and Supermodularity 



2.1 Introduction 

The concepts of convexity and supermodularity are important in the optimization 
and economics literature. These concepts have been widely applied in the analysis 
of a variety of supply chain models, from stochastic, multi-stage inventory problems 
to pricing models. Hence, in this chapter we provide a brief introduction to 
convexity and supermodularity, focusing on material most relevant to our context. For 
more details, readers are referred to the two excellent books by Rockafellar (1970) and 
Topkis (1998). 



2.2 Convex Analysis 

2.2.1 Convex Sets and Convex Functions 

Definition 2.2.1 A set $C \subseteq \mathbb{R}^n$ is called convex, if for any $x, x' \in C$ and 
$\lambda \in [0, 1]$, $(1 - \lambda)x + \lambda x' \in C$. 

Geometrically, a set is convex if and only if for any two points in the set, the 
line segment between these two points also lies in the set. Here are some simple 
examples of convex sets: an interval in $\mathbb{R}^1$; a disk and a square in $\mathbb{R}^2$; a sphere and a 
cube in $\mathbb{R}^3$. Also note that the set of solutions of a system of linear inequalities, that is 
$\{x \in \mathbb{R}^n : Ax \le b\}$, is convex, where $A$ is a linear mapping from $\mathbb{R}^n$ to $\mathbb{R}^m$ and $b$ is 
a vector in $\mathbb{R}^m$. Finally, the intersection of convex sets is also convex, and convexity 
is preserved under a linear transformation, i.e., the set $AC + b := \{Ax + b \mid x \in C\}$ 
is still convex if $C$ is. 
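
The convexity of the solution set of a system of linear inequalities can be checked directly from Definition 2.2.1. As a short verification, take any two points $x, x'$ with $Ax \le b$ and $Ax' \le b$, and any $\lambda \in [0, 1]$. Then, by the linearity of $A$,

```latex
A\bigl((1-\lambda)x + \lambda x'\bigr)
  = (1-\lambda)\,Ax + \lambda\,Ax'
  \le (1-\lambda)\,b + \lambda\,b
  = b ,
```

where the inequality holds componentwise because $1 - \lambda$ and $\lambda$ are both nonnegative. Hence the point $(1-\lambda)x + \lambda x'$ also satisfies the system.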







FIGURE 2.1. Examples of Convex Sets and Non-Convex Sets (panels: convex sets; non-convex sets) 



Definition 2.2.2 Given a convex set $C$ in $\mathbb{R}^n$, a function $f : C \to \mathbb{R}$ is convex over set 
$C$, if for any $x, x' \in C$ and $\lambda \in [0, 1]$, 

$$f((1 - \lambda)x + \lambda x') \le (1 - \lambda) f(x) + \lambda f(x'). \tag{2.1}$$

$f$ is strictly convex if the inequality (2.1) holds strictly for any $x, x' \in C$ with 
$x \ne x'$ and $\lambda \in (0, 1)$. Finally, $f$ is called (strictly) concave if $-f$ is (strictly) 
convex. 

Remark: When $f$ is (strictly) convex over $\mathbb{R}^n$, we simply say that $f$ is (strictly) 
convex. From now on, we mainly focus on the case when $C = \mathbb{R}^n$ to simplify our 
presentation. In fact, almost all the results about convex functions defined over $\mathbb{R}^n$ 
hold for convex functions defined over a convex subset of $\mathbb{R}^n$, possibly with minor 
modification. 

Sometimes it is convenient to work with functions that take the value of infinity. 
In this case, for a given convex set $C$ in $\mathbb{R}^n$, a function $f : C \to \mathbb{R} \cup \{\infty\}$ is convex 
over $C$ in the extended sense if the inequality (2.1) still holds for $f$. The arithmetic 
convention here includes $\infty + \infty = \infty$, $0 \cdot \infty = 0$, and $a \cdot \infty = \infty$ for $a > 0$. Of 
course, usually we can restrict ourselves to the effective domain of function $f$, which 
is defined as follows: 

$$\mathrm{dom}(f) := \{x \in C \mid f(x) < \infty\}.$$

But on some occasions, it is more economical to use convex functions in the 
extended sense. Finally, for a convex function $f : C \to \mathbb{R}$, define 

$$\bar{f}(x) := \begin{cases} f(x), & \text{if } x \in C, \\ \infty, & \text{otherwise.} \end{cases}$$

It is easy to see that $\bar{f} : \mathbb{R}^n \to \mathbb{R} \cup \{\infty\}$ is convex in the extended sense. 

It is also interesting to point out that if a function $f : \mathbb{R}^n \to \mathbb{R}$ is continuous, then 
the convexity of the function $f$ is equivalent to the following: for any $x, x' \in \mathbb{R}^n$, 
$f((x + x')/2) \le \frac{1}{2}(f(x) + f(x'))$. This is left as an exercise. 

We now establish a connection between convex sets and convex functions through 
the epigraph mapping. The epigraph of a function $f : C \to \mathbb{R}$ is defined as 

$$\mathrm{epi}(f) := \{(x, \alpha) \mid x \in C,\ \alpha \in \mathbb{R},\ f(x) \le \alpha\}.$$

It is easy to verify that a function $f$ is convex on a convex set $C$ if and only if its 
epigraph $\mathrm{epi}(f)$ is convex. The epigraph mapping allows us to translate properties 
of convex sets into results about convex functions. 

The graphical meaning of a convex function is clear; see Figure 2.2 for an 
illustration. In fact, a function $f$ is convex if and only if for any given $x$ and $x'$, 
the curve $(\lambda, f((1 - \lambda)x + \lambda x'))$ for $\lambda \in [0, 1]$ always lies below the line segment 
connecting two points $(x, f(x))$ and $(x', f(x'))$. Obviously, a linear function is both 
convex and concave. 




In the following, we summarize some useful properties of convex functions. The 
proofs of these properties are straightforward. 

Proposition 2.2.3 (a) Any nonnegative linear combination of convex functions is 
convex. That is, if $f_i : \mathbb{R}^n \to \mathbb{R}$ are convex for $i = 1, 2, \ldots, m$, then for any 
scalars $\alpha_i \ge 0$, $\sum_{i=1}^{m} \alpha_i f_i$ is also convex. 

(b) A composition of a convex function and an affine function is still convex. 
That is, if $f : \mathbb{R}^m \to \mathbb{R}$ is convex, then for any linear mapping $A$ from $\mathbb{R}^n$ 
to $\mathbb{R}^m$ and a vector $b$ in $\mathbb{R}^m$, $f(Ax + b)$ is also a convex function of $x \in \mathbb{R}^n$. 

(c) A composition of an increasing convex function and a convex function is still 
convex. That is, if $f : \mathbb{R} \to \mathbb{R}$ is convex and nondecreasing and $g : \mathbb{R}^n \to \mathbb{R}$ 
is convex, then $f(g(x))$ is convex. 

(d) If $f_k : \mathbb{R}^n \to \mathbb{R}$ is convex for $k = 1, 2, \ldots$ and $\lim_{k \to \infty} f_k(x) = f(x)$ for any 
$x \in \mathbb{R}^n$, then $f(x)$ is convex. 

(e) Assume that a function $f(\cdot, \cdot)$ is defined on the product space $\mathbb{R}^n \times \mathbb{R}^m$. 
If $f(\cdot, y)$ is convex for any given $y \in \mathbb{R}^m$, then for a random vector $\xi$ in 
$\mathbb{R}^m$, $E_\xi[f(x, \xi)]$ is convex, provided it is well defined. As a special case, if 
$f : \mathbb{R}^n \to \mathbb{R}$ is convex, then $E_\xi[f(x - \xi)]$ is also convex. 
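
To illustrate how short these arguments are, part (c) can be verified in two steps. The first inequality below uses the convexity of $g$ together with the monotonicity of $f$; the second uses the convexity of $f$. For any $x, x' \in \mathbb{R}^n$ and $\lambda \in [0, 1]$,

```latex
\begin{align*}
f\bigl(g((1-\lambda)x + \lambda x')\bigr)
  &\le f\bigl((1-\lambda)\,g(x) + \lambda\,g(x')\bigr) \\
  &\le (1-\lambda)\,f(g(x)) + \lambda\,f(g(x')).
\end{align*}
```

The monotonicity assumption cannot simply be dropped. For example, with $f(y) = y^2$ and $g(x) = x^2 - 1$ on $\mathbb{R}$, both functions are convex, but $f(g(x)) = (x^2 - 1)^2$ equals $1$ at $x = 0$ and $0$ at $x = \pm 1$, so it is not convex.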

A weaker notion than convexity, which is also commonly used, is quasi-convexity. 

Definition 2.2.4 A function $f : \mathbb{R}^n \to \mathbb{R}$ is called quasi-convex on a convex set 
$C$, if for any $x, x' \in C$ and $\lambda \in [0, 1]$, 

$$f((1 - \lambda)x + \lambda x') \le \max\{f(x), f(x')\}.$$

Quasi-convexity of function $f : \mathbb{R}^n \to \mathbb{R}$ is equivalent to the fact that $-f$ is 
unimodal. That is, if $x^*$ is a global maximizer of function $-f$, then for any $x \in \mathbb{R}^n$, 
$-f((1 - \lambda)x + \lambda x^*)$ is a nondecreasing function of $\lambda$ for $\lambda \in [0, 1]$. 

For a function $f : C \to \mathbb{R}$ and a given $\alpha \in \mathbb{R}$, define the level set of $f$ as 

$$L_\alpha(f) := \{x \in C \mid f(x) \le \alpha\}.$$

One can show, from the definition of quasi-convexity, that $f$ is quasi-convex on a 
convex set $C$ if and only if the level set $L_\alpha(f)$ is convex for any $\alpha \in \mathbb{R}$. 
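
The gap between convexity and quasi-convexity can be seen on a small numerical example. The sketch below uses the function $f(x) = \sqrt{|x|}$, chosen here for illustration: its level sets are intervals, so it is quasi-convex, yet it violates inequality (2.1). The grid of test points and the tolerance are arbitrary assumptions; a grid check like this can expose a violation but cannot by itself prove that an inequality always holds.

```python
import math

def f(x):
    """sqrt(|x|): quasi-convex on the real line, but not convex."""
    return math.sqrt(abs(x))

def violates_convexity(x, y, lam, tol=1e-12):
    z = (1 - lam) * x + lam * y
    return f(z) > (1 - lam) * f(x) + lam * f(y) + tol

def violates_quasi_convexity(x, y, lam, tol=1e-12):
    z = (1 - lam) * x + lam * y
    return f(z) > max(f(x), f(y)) + tol

grid = [i / 4 for i in range(-8, 9)]   # test points in [-2, 2]
lams = [j / 10 for j in range(11)]     # lambda in [0, 1]
triples = [(x, y, lam) for x in grid for y in grid for lam in lams]

convexity_fails = any(violates_convexity(*t) for t in triples)
quasi_convexity_fails = any(violates_quasi_convexity(*t) for t in triples)
```

For instance, with $x = 0$, $x' = 1$ and $\lambda = 1/2$, the value at the midpoint is $\sqrt{0.5} \approx 0.707$, which exceeds the average $0.5$ of the endpoint values but not their maximum $1$.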

2.2.2 Continuity and Differentiability Properties 

In this section, we discuss the continuity and differentiability properties of convex 
functions. Before we proceed to prove the continuity of convex functions, observe 
that the convexity of a function $f : \mathbb{R}^n \to \mathbb{R}$ is equivalent to the following: for any 
$x^i \in \mathbb{R}^n$, $\lambda_i \in \mathbb{R}$ with $\lambda_i \ge 0$ ($i = 1, 2, \ldots, m$) and $\sum_{i=1}^{m} \lambda_i = 1$, 

$$f\Bigl(\sum_{i=1}^{m} \lambda_i x^i\Bigr) \le \sum_{i=1}^{m} \lambda_i f(x^i). \tag{2.2}$$

This is a special case of Jensen’s inequality. 

Theorem 2.2.5 If $f : \mathbb{R}^n \to \mathbb{R}$ is convex, then it is continuous. 

Proof. We only need to show, without loss of generality, that $f$ is continuous at 
$x = 0$. 

First we argue that $f(x)$ is bounded above over the set $S = \{x \in \mathbb{R}^n \mid \|x\|_1 = 
\sum_{i=1}^{n} |x_i| \le 1\}$. Let $e_i$ ($i = 1, 2, \ldots, n$) be a unit vector in $\mathbb{R}^n$ with 1 at its $i$th 
component and 0 at other components, and let $e_{n+i} = -e_i$ ($i = 1, 2, \ldots, n$). Then 
for any $x \in S$, there exist $\lambda_i \ge 0$ ($i = 1, 2, \ldots, 2n$) with $\sum_{i=1}^{2n} \lambda_i = 1$ such that 
$x = \sum_{i=1}^{2n} \lambda_i e_i$. Therefore (2.2) implies that $f(x) \le \max_{i=1,2,\ldots,2n} f(e_i)$. 
We now show that for any sequence $x^k \in \mathbb{R}^n$ ($k = 1, 2, \ldots$) convergent to $x = 0$, 
$f(x^k)$ converges to $f(0)$. Since $x^k$ converges to 0, we can assume without loss of 
generality that $x^k \in S$ for all $k$. The definition of convex functions implies that 

$$f(x^k) \le (1 - \|x^k\|_1) f(0) + \|x^k\|_1 f(x^k / \|x^k\|_1).$$

Letting $k$ tend to infinity, we have 

$$\limsup_{k \to \infty} f(x^k) \le f(0).$$

Also observe that $0 = \bigl(1 - \frac{\|x^k\|_1}{1 + \|x^k\|_1}\bigr) x^k + \frac{\|x^k\|_1}{1 + \|x^k\|_1} \bigl(-x^k / \|x^k\|_1\bigr)$. Again, the definition of 
convex functions implies that 

$$f(0) \le \Bigl(1 - \frac{\|x^k\|_1}{1 + \|x^k\|_1}\Bigr) f(x^k) + \frac{\|x^k\|_1}{1 + \|x^k\|_1} f(-x^k / \|x^k\|_1).$$

Letting $k$ tend to infinity, we have 

$$f(0) \le \liminf_{k \to \infty} f(x^k).$$

Thus, $\lim_{k \to \infty} f(x^k) = f(0)$ for any sequence $x^k$ convergent to $x = 0$ and therefore 
$f$ is continuous at 0. ∎ 

Thus a convex function is always continuous. A natural question is whether 
a convex function is differentiable. Unfortunately it is not always the case. For 
example, the absolute value function $|x|$ is convex while not differentiable at $x = 0$. 
Even though a convex function may not be differentiable, we will show in the 
following that for a convex function, its directional derivative always exists and 
possesses nice properties. Recall that for any $x, y \in \mathbb{R}^n$, the directional derivative 
of a function $f : \mathbb{R}^n \to \mathbb{R}$ at $x$ is defined as follows: 

$$f'(x; y) := \lim_{t \downarrow 0} \frac{f(x + ty) - f(x)}{t}.$$

For a function $f$ defined on one dimensional space, let $f'_+(x) = f'(x; 1)$ and 
$f'_-(x) = -f'(x; -1)$, and define 

$$D_f(x, t) := \frac{f(x + t) - f(x)}{t}.$$

The following result, which has been widely applied, is helpful in establishing 
monotonicity properties of $f'_+$ and $f'_-$. 

Proposition 2.2.6 Assume that a function $f : \mathbb{R} \to \mathbb{R}$ is convex. Then for any 
$x, x', t, t' \in \mathbb{R}$ with $x \le x'$ and $0 < t \le t'$ or $x < x + t \le x' < x' + t'$, 

$$D_f(x, t) \le D_f(x', t'). \tag{2.3}$$

In particular, when $t = t'$, $f$ has increasing differences, that is, for any $x \le x'$, $t \ge 0$, 

$$f(x + t) - f(x) \le f(x' + t) - f(x').$$
Proof. Observe that $x < x + t$, $x' < x' + t$. There exist $\lambda, \lambda' \in [0, 1]$ such that 
$x + t = (1 - \lambda)x + \lambda(x' + t)$ and $x' = (1 - \lambda')x + \lambda'(x' + t)$. The definitions of $\lambda$ 
and $\lambda'$ imply that $\lambda + \lambda' = 1$. From the convexity of $f$, we have 

$$f(x + t) \le (1 - \lambda) f(x) + \lambda f(x' + t),$$

and 

$$f(x') \le (1 - \lambda') f(x) + \lambda' f(x' + t).$$

Adding the two inequalities together and rearranging terms, we have that 

$$f(x + t) - f(x) \le f(x' + t) - f(x'). \tag{2.4}$$

Thus, a convex function has increasing differences. 

We now assume that $0 < t \le t'$. From the convexity of $f$, we have that 

$$f(x' + t) \le \Bigl(1 - \frac{t}{t'}\Bigr) f(x') + \frac{t}{t'} f(x' + t'),$$

which immediately implies that 

$$\frac{f(x' + t) - f(x')}{t} \le \frac{f(x' + t') - f(x')}{t'}.$$

This inequality, together with the inequality (2.4), implies the inequality (2.3). 

Finally, assume that $x < x + t < x' < x' + t'$. Again the convexity of $f$ implies 
that 

$$f(x + t) \le (1 - \lambda) f(x) + \lambda f(x'),$$

and 

$$f(x') \le (1 - \lambda') f(x + t) + \lambda' f(x' + t'),$$

where $\lambda = \frac{t}{x' - x}$ and $\lambda' = \frac{x' - (x + t)}{x' + t' - (x + t)}$. The above inequalities are equivalent to 
the following: 

$$\frac{f(x + t) - f(x)}{t} \le \frac{f(x') - f(x + t)}{x' - (x + t)},$$

and 

$$\frac{f(x') - f(x + t)}{x' - (x + t)} \le \frac{f(x' + t') - f(x')}{t'}.$$

Therefore, the inequality (2.3) holds if $x < x + t < x' < x' + t'$. The continuity of 
convex functions implies that (2.3) is still true if $x < x + t \le x' < x' + t'$. ∎ 
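
Inequality (2.3) is easy to probe numerically. The sketch below evaluates the difference quotient $D_f(x, t)$ on a small grid for one convex function, $|x| + x^2$, and one concave function, $-x^2$. Both test functions, the grid and the helper names `difference_quotient` and `satisfies_2_3` are choices made here for illustration. A grid check can reveal a violation, but passing it is not a proof.

```python
def difference_quotient(f, x, t):
    """D_f(x, t) = (f(x + t) - f(x)) / t for t > 0."""
    return (f(x + t) - f(x)) / t

def satisfies_2_3(f, xs, ts, tol=1e-12):
    """Check D_f(x, t) <= D_f(x2, t2) whenever x <= x2 and 0 < t <= t2."""
    return all(
        difference_quotient(f, x, t) <= difference_quotient(f, x2, t2) + tol
        for x in xs for x2 in xs if x <= x2
        for t in ts for t2 in ts if t <= t2
    )

xs = [i / 2 for i in range(-6, 7)]   # points in [-3, 3]
ts = [0.5, 1.0, 1.5, 2.0]            # positive step sizes

convex_ok = satisfies_2_3(lambda x: abs(x) + x * x, xs, ts)
concave_ok = satisfies_2_3(lambda x: -x * x, xs, ts)
```

For the concave function the difference quotient is $-2x - t$, which decreases in both arguments, so the check fails, for example at $x = 0$, $x' = 1$ and $t = t' = 0.5$.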

Theorem 2.2.7 Assume that $f : \mathbb{R} \to \mathbb{R}$ is convex. Then 

(a) $f'_+$ and $f'_-$ are well defined, and for any $x, x' \in \mathbb{R}$ with $x < x'$, 
$f'_-(x) \le f'_+(x) \le f'_-(x')$. 

(b) For any fixed $x \in \mathbb{R}$, 

$$f(x + t) - f(x) \ge \xi t$$

for any $t \in \mathbb{R}$ if and only if $\xi \in [f'_-(x), f'_+(x)]$. 
Proof. From (2.3), we know that $D_f(x, t)$ is a nondecreasing function of $t$ for 
$t > 0$ (simply let $x' = x$) and is bounded below by $D_f(x', t')$ for any $x' < x$ and 
$0 < t' \le x - x'$. Therefore, 

$$f'_+(x) = \inf_{t > 0} D_f(x, t).$$

Similarly, $D_f(x - t, t)$ is non-increasing in $t$ and hence 

$$f'_-(x) = \sup_{t > 0} D_f(x - t, t).$$

Indeed, define a new convex function $g(x) = f(-x)$ and the non-increasing 
property of $D_f(x - t, t)$ follows from applying Proposition 2.2.6 to the function $g$. 
Again from (2.3), it is easy to see that for any $x < x'$ and $0 < t \le x' - x$, 

$$D_f(x - t, t) \le D_f(x, t) \le D_f(x' - t, t).$$

Letting $t \downarrow 0$ yields $f'_-(x) \le f'_+(x) \le f'_-(x')$. Thus, part (a) is true. 

Finally, part (b) is a direct consequence of the proof for part (a). ∎ 

Theorem 2.2.7 part (b) implies that for any $\xi \in [f'_-(x), f'_+(x)]$, the function 
$f(x')$ always lies above the line $L_x = \{(x', f(x) + \xi(x' - x)) \mid x' \in \mathbb{R}\}$ for any $x$; 
see Figure 2.3 for an illustration. This result can be extended to convex functions 
defined in $\mathbb{R}^n$. For this purpose, we introduce the concept of subgradient. 




FIGURE 2.3. Illustration of the Definition of Subgradient 



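Definition 2.2.8 Given a function $f : \mathbb{R}^n \to \mathbb{R}$, $\xi \in \mathbb{R}^n$ is a subgradient of the 
function $f$ at $x \in \mathbb{R}^n$, if for any $t \in \mathbb{R}^n$, 

$$f(x + t) - f(x) \ge \langle \xi, t \rangle, \tag{2.5}$$

where $\langle \xi, t \rangle = \sum_{i=1}^{n} \xi_i t_i$ is the inner product between $\xi$ and $t$. Let the subdifferential 
$\partial f(x)$ be the set of all subgradients of $f$ at $x$. 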
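
As a small worked example, consider the absolute value function $f(x) = |x|$ on $\mathbb{R}$. At $x = 0$, condition (2.5) reads $|t| \ge \xi t$ for all $t \in \mathbb{R}$, which holds exactly when $|\xi| \le 1$. Away from the origin the function is differentiable and the only subgradient is its derivative:

```latex
\partial f(0) = \{\xi \in \mathbb{R} : |t| \ge \xi t \ \text{for all } t \in \mathbb{R}\} = [-1, 1],
\qquad
\partial f(x) = \{\operatorname{sign}(x)\} \quad \text{for } x \ne 0 .
```

This agrees with Theorem 2.2.7 part (b), since $f'_-(0) = -1$ and $f'_+(0) = 1$.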
The following theorem characterizes properties of subgradients. The proof of 
these properties is omitted since it is quite involved; see Rockafellar (1970) for 
more details. 

Theorem 2.2.9 Assume that a function $f : \mathbb{R}^n \to \mathbb{R}$ is convex. Then 

(a) For any $x \in \mathbb{R}^n$, $\partial f(x)$ is nonempty, convex and compact. 

(b) For any $x, y \in \mathbb{R}^n$, 

$$f'(x; y) = \sup_{\xi \in \partial f(x)} \langle \xi, y \rangle.$$

(c) $f$ is differentiable at $x \in \mathbb{R}^n$ if and only if $\partial f(x) = \{\nabla f(x)\}$. 

(d) For any compact set $C \subset \mathbb{R}^n$, $\bigcup_{x \in C} \partial f(x)$ is compact. 

We now present the general form of Jensen’s inequality. 

Proposition 2.2.10 Let $f$ be a convex function over $\mathbb{R}$ and let $\zeta$ be a random variable 
with finite expectation $E[\zeta]$. Then 

$$f(E[\zeta]) \le E[f(\zeta)].$$

Proof. This proposition can be proven by using the special case of Jensen’s 
inequality (2.2) and the continuity of convex functions as well as the definition 
of expectations. We present an alternative approach based on the properties of 
subgradients. 

Choose any $\xi \in \partial f(E[\zeta])$. From the definition of subgradients, we have that 

$$f(\zeta) - f(E[\zeta]) \ge \langle \xi, \zeta - E[\zeta] \rangle.$$

Taking expectations on both sides yields $f(E[\zeta]) \le E[f(\zeta)]$. ∎ 
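
Jensen’s inequality can be checked exactly on a small discrete example. The sketch below uses a random variable with three values and the convex function $f(x) = x^2$; the values, probabilities and variable names are hypothetical and chosen only for illustration. Exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction

# Hypothetical discrete random variable: values and their probabilities.
values = [Fraction(1), Fraction(2), Fraction(6)]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

def f(x):
    """A convex function; here simply the square."""
    return x * x

mean = sum(p * v for p, v in zip(probs, values))          # E[zeta]
f_of_mean = f(mean)                                       # f(E[zeta])
mean_of_f = sum(p * f(v) for p, v in zip(probs, values))  # E[f(zeta)]
gap = mean_of_f - f_of_mean
```

Here $E[\zeta] = 13/6$, so $f(E[\zeta]) = 169/36$, while $E[f(\zeta)] = 47/6 = 282/36$. For the square function the gap $113/36$ is the variance of $\zeta$, which is always nonnegative.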

2.2.3 Characterization of Convex Functions 

The concept of convexity is widely used in optimization. However, identifying 
convex functions is not always simple. In this section, we give some sufficient and 
necessary conditions for a differentiable function to be convex. 

Theorem 2.2.11 Consider a function $f : \mathbb{R}^n \to \mathbb{R}$. 

(a) If $f$ is differentiable, then $f$ is convex if and only if for any $x, x' \in \mathbb{R}^n$, 

$$f(x') - f(x) \ge \langle \nabla f(x), x' - x \rangle, \tag{2.6}$$

where $\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i$ is the inner product of $x, y \in \mathbb{R}^n$. 

(b) If $f$ is differentiable, then $f$ is strictly convex if the inequality (2.6) holds 
strictly for any $x \ne x'$. 

(c) If $f$ is continuously differentiable, then $f$ is convex if and only if $\nabla f$ is 
monotone, that is, for any $x, x' \in \mathbb{R}^n$, $\langle \nabla f(x') - \nabla f(x), x' - x \rangle \ge 0$. 

(d) If $f$ is twice continuously differentiable, then $f$ is convex if and only if $\nabla^2 f(x)$ is 
positive semi-definite for any $x \in \mathbb{R}^n$. 

(e) If $f$ is twice continuously differentiable, then $f$ is strictly convex if $\nabla^2 f(x)$ 
is positive definite for any $x \in \mathbb{R}^n$. 

(f) Assume that $f(x) = x^T Q x$ for a symmetric matrix $Q$ of order $n$ and $x \in \mathbb{R}^n$. 
Then $f$ is convex if and only if $Q$ is positive semi-definite. And $f$ is strictly 
convex if and only if $Q$ is positive definite. 
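
Part (f) can be illustrated on 2 by 2 matrices. The sketch below tests positive semi-definiteness through the principal minors (both diagonal entries and the determinant nonnegative, which is sufficient in the 2 by 2 symmetric case) and separately checks the midpoint form of inequality (2.1) on a small grid of points. The two matrices, the grid and the helper names are hypothetical choices made here for illustration.

```python
def quad(Q, x):
    """f(x) = x^T Q x for a 2x2 symmetric matrix Q."""
    return Q[0][0] * x[0] * x[0] + 2 * Q[0][1] * x[0] * x[1] + Q[1][1] * x[1] * x[1]

def is_psd_2x2(Q):
    """A symmetric 2x2 matrix is PSD iff all principal minors are nonnegative."""
    det = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
    return Q[0][0] >= 0 and Q[1][1] >= 0 and det >= 0

def midpoint_convex_on(Q, points, tol=1e-12):
    """Check f((a + b) / 2) <= (f(a) + f(b)) / 2 for all pairs of grid points."""
    return all(
        quad(Q, [(a[0] + b[0]) / 2, (a[1] + b[1]) / 2])
        <= (quad(Q, a) + quad(Q, b)) / 2 + tol
        for a in points for b in points
    )

points = [(i, j) for i in range(-2, 3) for j in range(-2, 3)]
Q_good = [[2, 1], [1, 2]]   # determinant 3: positive definite
Q_bad = [[1, 2], [2, 1]]    # determinant -3: indefinite

good_psd, good_convex = is_psd_2x2(Q_good), midpoint_convex_on(Q_good, points)
bad_psd, bad_convex = is_psd_2x2(Q_bad), midpoint_convex_on(Q_bad, points)
```

For the indefinite matrix, the points $(1, -1)$ and $(-1, 1)$ both have value $-2$ while their midpoint, the origin, has value $0$, so the convexity inequality fails there.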

Proof. Assume that $f$ is differentiable. Pick any $x, x' \in \mathbb{R}^n$ and define 

$$\phi(\lambda) := f(x + \lambda(x' - x)).$$

First notice that $f(x)$ is convex in $x$ if and only if $\phi(\lambda)$ is convex for $\lambda \in [0, 1]$ for 
any picked $x, x' \in \mathbb{R}^n$. Also observe that 

$$\phi'(\lambda) = \langle \nabla f(x + \lambda(x' - x)), x' - x \rangle.$$

If $\phi(\lambda)$ is convex in $\lambda$, then Theorem 2.2.7 part (b) implies that 

$$f(x') - f(x) = \phi(1) - \phi(0) \ge \phi'(0) = \langle \nabla f(x), x' - x \rangle.$$

Hence the inequality (2.6) holds for any $x, x' \in \mathbb{R}^n$. On the other hand, if the 
inequality (2.6) is true for any $x, x' \in \mathbb{R}^n$, then for any $z = (1 - \lambda)x + \lambda x'$ with 
$\lambda \in [0, 1]$, we have 

$$f(x) - f(z) \ge \langle \nabla f(z), x - z \rangle,$$

and 

$$f(x') - f(z) \ge \langle \nabla f(z), x' - z \rangle.$$

Multiplying the first inequality by $(1 - \lambda)$ and the second inequality by $\lambda$ and 
summing them up, we end up with 

$$f((1 - \lambda)x + \lambda x') \le (1 - \lambda) f(x) + \lambda f(x').$$

Thus $f$ is convex and part (a) is true. Obviously, from the above argument, one 
can see that $f$ is strictly convex if the inequality (2.6) holds strictly. Therefore, part 
(b) holds. 

For part (c), notice that if $\phi(\lambda)$ is convex, then Theorem 2.2.7 part (a) implies 
that $\phi'(0) \le \phi'(1)$. Thus, 

$$\langle \nabla f(x') - \nabla f(x), x' - x \rangle = \phi'(1) - \phi'(0) \ge 0.$$

On the other hand, if $\nabla f$ is monotone, then for any $\lambda' \ge \lambda \ge 0$, 

$$\phi'(\lambda') = \langle \nabla f(x + \lambda'(x' - x)), x' - x \rangle \ge \langle \nabla f(x + \lambda(x' - x)), x' - x \rangle = \phi'(\lambda).$$

Therefore, $\phi'(\lambda)$ is nondecreasing, and hence 

$$\phi(\lambda) = \phi(0) + \int_0^{\lambda} \phi'(\xi)\, d\xi \le \phi(0) + \lambda \phi'(\lambda), \tag{2.7}$$

and 

$$\phi(\lambda) = \phi(1) - \int_{\lambda}^{1} \phi'(\xi)\, d\xi \le \phi(1) - (1 - \lambda) \phi'(\lambda). \tag{2.8}$$

Multiplying the first inequality by $(1 - \lambda)$ and the second inequality by $\lambda$ and 
summing them up, we end up with 

$$\phi(\lambda) \le (1 - \lambda) \phi(0) + \lambda \phi(1), \tag{2.9}$$

that is, $\phi$ is convex for $\lambda \in [0, 1]$. Thus $f$ is convex. Also notice that from the above 
proof, $\nabla f$ is monotone if and only if $\phi'(\lambda)$ is nondecreasing for $\lambda \in [0, 1]$. 

We now assume that $f$ is twice continuously differentiable. In this case,

$$\phi''(\lambda) = \langle x' - x, \nabla^2 f(x + \lambda(x' - x))(x' - x) \rangle.$$

Notice that for any $0 \leq \lambda \leq \lambda' \leq 1$,

$$\phi'(\lambda') - \phi'(\lambda) = \int_{\lambda}^{\lambda'} \phi''(\xi)\, d\xi.$$

Therefore, if $\nabla^2 f(x)$ is positive semi-definite for any $x \in \Re^n$, then $\phi''(\xi) \geq 0$ for
any $\xi$ and hence $\phi'(\lambda)$ is nondecreasing, which in turn implies that $f$ is convex as
we already proved for part (c). On the other hand, the convexity of $f$ implies that
$\phi'$ is nondecreasing, which in turn implies that $\phi''(\lambda) \geq 0$ for any $\lambda \in [0,1]$. In
particular, we have

$$0 \leq \phi''(0) = \langle x' - x, \nabla^2 f(x)(x' - x) \rangle.$$

Since $x' \in \Re^n$ is arbitrary, we have that $\nabla^2 f(x)$ is positive semi-definite. This
proves part (d).

If $\nabla^2 f(x)$ is positive definite for any $x \in \Re^n$, then for $x \neq x'$, $\phi'(\lambda)$ is strictly
increasing for $\lambda \in [0,1]$ and at least one of the inequalities (2.7) and (2.8) holds
as a strict inequality. Hence the inequality (2.9) holds strictly. This implies that $\phi$
and therefore $f$ is strictly convex. Thus part (e) holds.

Finally, for part (f), we only need to prove that if $f$ is strictly convex, then $Q$ is
positive definite. The remaining results are special cases of parts (d) and (e). If $f$ is
strictly convex, then part (d) implies that $Q$ is positive semi-definite. Assume to the
contrary that $Q$ is not positive definite. Then there exists a nonzero vector $z \in \Re^n$ such
that $f(z) = \langle z, Qz \rangle = 0$. Therefore for any $x \in \Re^n$, $f(x + \lambda z) = f(x) + 2\lambda \langle x, Qz \rangle$,
which is a linear function of $\lambda$. Thus $f$ is not strictly convex; a contradiction. Hence
$Q$ is positive definite if $f$ is strictly convex. ∎

2.2.4 Convexity and Optimization

Convexity plays an important role in optimization theory. In particular, we show 
in the following that a local minimizer of a convex minimization problem, i.e., the 
problem of minimizing a convex function, is a global minimizer of this problem, 
and the first order optimality condition is both sufficient and necessary for a point 
to be a global minimizer. As we shall see, this result has implications both for 
optimization theory and algorithms. 

Theorem 2.2.12 Assume that $f : \Re^n \to \Re$ is convex. If $x^*$ is a local minimizer
of f, then $x^*$ is a global minimizer of f. Furthermore, $x^*$ is a global minimizer of
f if and only if $0 \in \partial f(x^*)$.

Proof. If $x^*$ is a local minimizer, then there exists a ball $B_\epsilon(x^*) = \{x \in \Re^n \mid \|x - x^*\|_2 \leq \epsilon\}$
for some $\epsilon > 0$ such that $f(x) \geq f(x^*)$ for any $x \in B_\epsilon(x^*)$. Moreover,
for any $x \in \Re^n$, there exists $\lambda \in (0,1)$ such that $(1 - \lambda)x^* + \lambda x \in B_\epsilon(x^*)$. From the
definition of convexity, we have

$$f(x^*) \leq f((1 - \lambda)x^* + \lambda x) \leq (1 - \lambda) f(x^*) + \lambda f(x).$$

The inequality implies that $f(x^*) \leq f(x)$ for any $x \in \Re^n$. Hence $x^*$ is a global
minimizer of $f$ and from the definition of subgradient, we have $0 \in \partial f(x^*)$. Finally,
if $0 \in \partial f(x^*)$, then the definition of subgradient implies that $f(x) \geq f(x^*)$ for any
$x \in \Re^n$. In other words, $x^*$ is a global minimizer of the function $f$. Therefore, $x^*$ is
a global minimizer of $f$ if and only if $0 \in \partial f(x^*)$. ∎

The following result is a straightforward consequence of the definition of strictly 
convex functions. 

Theorem 2.2.13 A strictly convex function $f : \Re^n \to \Re$ has at most one local
and global minimizer.

We now consider the convex function maximization problem. If $f : \Re \to \Re$ is
convex, then from the definition of convexity, one can see that either $a$ or $b$ is
an optimal solution for the problem $\max_{x \in [a,b]} f(x)$. More generally, we have the
following result regarding the convex function maximization problem.

Theorem 2.2.14 Assume that a set $C \subset \Re^n$ is compact and $f : \Re^n \to \Re$ is
convex. Then $\max_{x \in C} f(x)$ attains its maximum at an extreme point $x^*$ of $C$.
That is, there exist no $x, x' \in C$ with $x \neq x'$ such that $x^* = (x + x')/2$.
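Theorem 2.2.14 can be checked numerically on a simple compact set. The sketch below is only an illustration: the convex function `f`, the box $C = [0,1] \times [0,2]$ and the grid resolution are all assumptions made for this example. It compares the best value on a grid sampling of $C$ with the best value over the four corners, which are the extreme points of the box.

```python
from itertools import product

def f(x):
    # A convex function chosen for illustration: a convex quadratic in x[0]
    # plus the absolute value of an affine function of (x[0], x[1]).
    return (x[0] - 0.3) ** 2 + abs(x[0] + 2 * x[1] - 1)

# Sample the box C = [0, 1] x [0, 2] on a regular grid (corners included).
steps = 20
grid = [(i / steps, 2 * j / steps) for i, j in product(range(steps + 1), repeat=2)]

# The extreme points of a box are its corners.
corners = [(0.0, 0.0), (0.0, 2.0), (1.0, 0.0), (1.0, 2.0)]

best_grid = max(f(p) for p in grid)
best_corner = max(f(p) for p in corners)
print(best_grid, best_corner, max(corners, key=f))
```

No sampled interior or edge point does better than the best corner, which is what the theorem predicts for a convex objective.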

We provide some intuition for the theorem instead of a formal proof. Assume that
a maximizer of $f(x)$ over the set $C$, $x^*$, is not an extreme point of $C$. Then there
exist $x, x' \in C$ with $x \neq x'$ such that $x^* = (x + x')/2$. Let $L = \{x^* + t(x' - x) \in C \mid t \in \Re\}$
be a line segment in $C$. It is clear, from the definition of convexity, that
all points on the line segment $L$ are maximizers of the function $f(x)$. Let $\bar{x}$ be one
of the endpoints of $L$. If $\bar{x}$ is an extreme point of $C$, we are done; otherwise, we
can repeat the above process and the theorem follows since such a process cannot
proceed an infinite number of times.

The following proposition shows that under some conditions, convexity is preserved
under optimization operations.

Proposition 2.2.15 Given a function $f(\cdot,\cdot)$ defined on the product space $\Re^n \times \Re^m$.

(a) If $f(\cdot, y)$ is convex for any given $y \in \Re^m$, then for any set $C \subset \Re^m$, the
function $h : \Re^n \to \Re \cup \{\infty\}$ defined by

$$h(x) := \sup_{y \in C} f(x, y)$$

is also convex (possibly in the extended sense).

(b) Assume that for any $x \in \Re^n$, there is an associated convex set $C(x) \subset \Re^m$
and $C := \{(x, y) \mid y \in C(x), x \in \Re^n\}$ is convex. If f is convex and the
function

$$g(x) := \inf_{y \in C(x)} f(x, y)$$

is well defined, then g is also convex over $\Re^n$.

Proof. To prove part (a), observe that for any given $y \in C$, the set

$$\mathrm{epi}(f(\cdot, y)) = \{(x, \alpha) \mid x \in \Re^n, \alpha \in \Re, f(x, y) \leq \alpha\}$$

is convex. Therefore $\mathrm{epi}(h) = \cap_{y \in C}\, \mathrm{epi}(f(\cdot, y))$ is a convex set, which implies that
$h$ is convex.

For part (b), let us fix $x, x' \in \Re^n$ and $\lambda \in [0,1]$. From the definition of infimum,
there exist, for any given $\epsilon > 0$, $y \in C(x)$ and $y' \in C(x')$ such that

$$f(x, y) \leq g(x) + \epsilon \quad \text{and} \quad f(x', y') \leq g(x') + \epsilon. \tag{2.10}$$

Since $C$ is convex, we have that $((1 - \lambda)x + \lambda x', (1 - \lambda)y + \lambda y') \in C$. Thus
$(1 - \lambda)y + \lambda y' \in C((1 - \lambda)x + \lambda x')$ and

$$g((1 - \lambda)x + \lambda x') \leq f((1 - \lambda)x + \lambda x', (1 - \lambda)y + \lambda y')$$
$$\leq (1 - \lambda) f(x, y) + \lambda f(x', y')$$
$$\leq (1 - \lambda) g(x) + \lambda g(x') + \epsilon,$$

where the second inequality holds since $f(\cdot,\cdot)$ is convex and the last inequality
follows from (2.10). Since $\epsilon$ is arbitrary, we conclude that $g$ is convex. ∎



2.3 Supermodularity 

In this section, we introduce the concept of supermodularity. Even though this
concept can be defined using partially ordered sets, for our purpose we focus on
the Euclidean space $\Re^n$.

To present the definition of supermodularity, we first introduce two operations,
the join and meet operations, in $\Re^n$. For any two points $x = (x_1, x_2, \ldots, x_n)$ and
$x' = (x'_1, x'_2, \ldots, x'_n)$ in $\Re^n$, define their join as

$$x \vee x' = (\max\{x_1, x'_1\}, \max\{x_2, x'_2\}, \ldots, \max\{x_n, x'_n\}),$$

and their meet as

$$x \wedge x' = (\min\{x_1, x'_1\}, \min\{x_2, x'_2\}, \ldots, \min\{x_n, x'_n\}).$$

Of course, if $x \leq x'$, that is, $x_i \leq x'_i$ for $i = 1, 2, \ldots, n$, then $x \vee x' = x'$ and $x \wedge x' = x$.
A set $X \subset \Re^n$ is called a lattice if for any $x, x' \in X$, $x \vee x', x \wedge x' \in X$.

Definition 2.3.1 Suppose $X$ is a subset of $\Re^n$ and $f : X \to \Re$ is a function. The
function f is supermodular on the set X if for any $x, x' \in X$,

$$f(x) + f(x') \leq f(x \vee x') + f(x \wedge x'), \tag{2.11}$$

whenever $x \wedge x', x \vee x' \in X$. f is strictly supermodular if the inequality (2.11)
holds strictly for unordered pairs $x$ and $x'$, that is, neither $x \leq x'$ nor $x \geq x'$ is
true. A function f is (strictly) submodular if $-f$ is (strictly) supermodular.
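The join, the meet and inequality (2.11) translate directly into a brute-force check on a finite lattice. The sketch below is illustrative only: the helper names `join`, `meet` and `is_supermodular`, the tolerance, and the $4 \times 4$ integer grid are assumptions of this example.

```python
from itertools import product

def join(x, y):
    """Componentwise maximum of two points (the join)."""
    return tuple(max(a, b) for a, b in zip(x, y))

def meet(x, y):
    """Componentwise minimum of two points (the meet)."""
    return tuple(min(a, b) for a, b in zip(x, y))

def is_supermodular(f, points, tol=1e-12):
    """Check inequality (2.11) for every pair whose join and meet lie in `points`."""
    pts = set(points)
    for x, y in product(points, repeat=2):
        j, m = join(x, y), meet(x, y)
        if j in pts and m in pts and f(x) + f(y) > f(j) + f(m) + tol:
            return False
    return True

# A small lattice in the plane: the integer grid {0,...,3} x {0,...,3}.
grid = list(product(range(4), repeat=2))

print(is_supermodular(lambda x: x[0] * x[1], grid))    # product of coordinates
print(is_supermodular(lambda x: -x[0] * x[1], grid))   # its negative
```

The product $x_1 x_2$ passes the check while its negative fails it (take $x = (1,0)$ and $x' = (0,1)$), so $-x_1 x_2$ is submodular rather than supermodular.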

A closely related and more intuitive concept is (strictly) increasing differences.
A function $f : \Re^2 \to \Re$ has (strictly) increasing differences if for any $t, t' \in \Re$ with
$t \leq t'$ ($t < t'$), $f(x, t') - f(x, t)$ is (strictly) increasing in $x$.

This concept of (strictly) increasing differences can be extended to functions
defined on $\Re^n$. For this purpose, we define, for a function $f : \Re^n \to \Re$, any pair of
indexes $i, j \in \{1, 2, \ldots, n\}$ and any vector

$$\bar{x}_{ij} = (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n) \in \Re^{n-2},$$

a function

$$f_{\bar{x}_{ij}}(x_i, x_j) = f(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_{j-1}, x_j, x_{j+1}, \ldots, x_n).$$

The function $f : \Re^n \to \Re$ has (strictly) increasing differences if the function
$f_{\bar{x}_{ij}}(x_i, x_j)$ has (strictly) increasing differences for any pair of distinct indexes
$i, j \in \{1, 2, \ldots, n\}$ and any vector $\bar{x}_{ij} \in \Re^{n-2}$.

The following result shows that for functions defined on $\Re^n$, the concept of
supermodularity is equivalent to the property of increasing differences.

Theorem 2.3.2 A function $f : \Re^n \to \Re$ is (strictly) supermodular if and only if
f has (strictly) increasing differences.

Proof. Assume that $f$ has increasing differences. Then, for any $x, x' \in \Re^n$, we
have

$$f(x) - f(x \wedge x') = \sum_{i=1}^{n} \big( f(x_1, \ldots, x_i, x_{i+1} \wedge x'_{i+1}, \ldots, x_n \wedge x'_n)$$
$$\qquad - f(x_1, \ldots, x_{i-1}, x_i \wedge x'_i, x_{i+1} \wedge x'_{i+1}, \ldots, x_n \wedge x'_n) \big)$$
$$\leq \sum_{i=1}^{n} \big( f(x_1 \vee x'_1, \ldots, x_{i-1} \vee x'_{i-1}, x_i, x'_{i+1}, \ldots, x'_n)$$
$$\qquad - f(x_1 \vee x'_1, \ldots, x_{i-1} \vee x'_{i-1}, x_i \wedge x'_i, x'_{i+1}, \ldots, x'_n) \big)$$
$$= \sum_{i=1}^{n} \big( f(x_1 \vee x'_1, \ldots, x_{i-1} \vee x'_{i-1}, x_i \vee x'_i, x'_{i+1}, \ldots, x'_n)$$
$$\qquad - f(x_1 \vee x'_1, \ldots, x_{i-1} \vee x'_{i-1}, x'_i, x'_{i+1}, \ldots, x'_n) \big)$$
$$= f(x \vee x') - f(x'),$$

where the inequality holds since $f$ has increasing differences and the second equality
holds since $\{x_i, x'_i\} = \{x_i \vee x'_i, x_i \wedge x'_i\}$. Hence $f$ is supermodular.

Assume now that $f$ is supermodular. For any pair of distinct indexes $i, j \in \{1, 2, \ldots, n\}$,
any vector

$$\bar{x}_{ij} = (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n) \in \Re^{n-2}$$

and $x_i, x'_i, x_j, x'_j \in \Re$ with $x_i \leq x'_i$ and $x_j \geq x'_j$, let

$$x = (x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_{j-1}, x_j, x_{j+1}, \ldots, x_n)$$

and

$$x' = (x_1, \ldots, x_{i-1}, x'_i, x_{i+1}, \ldots, x_{j-1}, x'_j, x_{j+1}, \ldots, x_n).$$

The supermodularity of $f$ implies that

$$f_{\bar{x}_{ij}}(x_i, x_j) - f_{\bar{x}_{ij}}(x_i, x'_j) = f(x) - f(x \wedge x')$$
$$\leq f(x \vee x') - f(x')$$
$$= f_{\bar{x}_{ij}}(x'_i, x_j) - f_{\bar{x}_{ij}}(x'_i, x'_j).$$

Thus $f$ has increasing differences.

Finally, the equivalence of the strict supermodularity and the strictly increasing
differences can be established by following a similar argument. ∎

If a function $f : \Re^n \to \Re$ is differentiable, it is easy to verify that $f$ is supermodular
if and only if the partial derivative $\partial f / \partial x_i$ is nondecreasing in $x_{i'}$ for all distinct
indexes $i$ and $i'$ and for any $x \in \Re^n$. Furthermore, if $f$ is twice differentiable, then
$f$ is supermodular if and only if $\partial^2 f / \partial x_i \partial x_{i'} \geq 0$ for any distinct indexes $i$ and $i'$ and
for any $x \in \Re^n$.
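The cross-partial test is easy to apply by hand. The two functions below are chosen here as illustrations; the second one does not appear in the text.

```latex
% Cobb-Douglas with two factors, on x_1, x_2 > 0 and \alpha_1, \alpha_2 > 0:
\[
f(x_1, x_2) = x_1^{\alpha_1} x_2^{\alpha_2},
\qquad
\frac{\partial^2 f}{\partial x_1 \partial x_2}
  = \alpha_1 \alpha_2\, x_1^{\alpha_1 - 1} x_2^{\alpha_2 - 1} \geq 0 ,
\]
% so f is supermodular on the positive orthant.
% A squared difference (illustrative, not from the text):
\[
g(x_1, x_2) = (x_1 - x_2)^2,
\qquad
\frac{\partial^2 g}{\partial x_1 \partial x_2} = -2 < 0 ,
\]
% so g is submodular, and -g is supermodular.
```

In both cases only the mixed second derivative matters; the pure second derivatives, which govern convexity, play no role in the test.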

From the definition of supermodularity, we can easily conclude that a function
$f : \Re \to \Re$ is both supermodular and submodular. In fact, the reverse is essentially
true.



Theorem 2.3.3 A function $f : \Re^n \to \Re$ is both supermodular and submodular if
and only if f is separable, that is, there exist functions $f_i : \Re \to \Re$ ($i = 1, 2, \ldots, n$)
such that $f(x) = \sum_{i=1}^{n} f_i(x_i)$ for any $x = (x_1, x_2, \ldots, x_n) \in \Re^n$.

Proof. The "if" part is obvious, since as we already pointed out, a function defined
on $\Re$ is both supermodular and submodular. Hence we focus on the "only if" part.
Assume that $f : \Re^n \to \Re$ is both supermodular and submodular. Then

$$f(x) = f(0) + \sum_{i=1}^{n} \big( f(x_1, \ldots, x_i, 0, \ldots, 0) - f(x_1, \ldots, x_{i-1}, 0, 0, \ldots, 0) \big)$$
$$= f(0) + \sum_{i=1}^{n} \big( f(0, \ldots, 0, x_i, 0, \ldots, 0) - f(0) \big),$$

where the second equality holds since from Theorem 2.3.2, f has both increasing
differences and decreasing differences. Therefore, f is separable. ∎

In the following we present some examples of supermodular functions, whose
proofs are left as an exercise.

Theorem 2.3.4 (a) The function $f(x, z) = \sum_{i=1}^{n} x_i z_i$ is supermodular on the
product space $\Re^n \times \Re^n$, where $x, z \in \Re^n$.

(b) The Cobb-Douglas function $f(x) = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$ for $\alpha_i \geq 0$ is supermodular
on the set $\{x \mid x = (x_1, x_2, \ldots, x_n) \geq 0\}$.

(c) The function $f(x, z) = -\sum_{i=1}^{n} |x_i - z_i|^p$ is supermodular on $(x, z) \in \Re^{2n}$ for
any $p \geq 1$.

(d) If $f_i(z)$ is increasing (decreasing) on $\Re$ for $i = 1, 2, \ldots, n$, then the function
$f(x) = \min_{i \in \{1, 2, \ldots, n\}} f_i(x_i)$ is supermodular on $\Re^n$.

We now list below some useful properties of supermodular functions. Some of
these properties are similar to those in Proposition 2.2.3, which deals with convex functions.

Proposition 2.3.5 (a) Any nonnegative linear combination of supermodular functions
is supermodular. That is, if $f_i : \Re^n \to \Re$ ($i = 1, 2, \ldots, m$) are supermodular,
then for any scalars $\alpha_i \geq 0$, $\sum_{i=1}^{m} \alpha_i f_i$ is still supermodular.

(b) If $f_k$ is supermodular for $k = 1, 2, \ldots$ and $\lim_{k \to \infty} f_k(x) = f(x)$ for any
$x \in \Re^n$, then $f(x)$ is supermodular.

(c) A composition of an increasing (decreasing) convex function and an increasing
supermodular (submodular) function is still supermodular. That is, if
$f : \Re \to \Re$ is convex and nondecreasing (nonincreasing) and $g : \Re^n \to \Re$
is increasing and supermodular (submodular), then $f(g(x))$ is supermodular.
(Notice that $g : \Re^n \to \Re$ is called increasing if for any $x, x' \in \Re^n$ with
$x \leq x'$, $g(x) \leq g(x')$.)

(d) Assume that a function $f(\cdot,\cdot)$ is defined on the product space $\Re^n \times \Re^m$. If
$f(\cdot, y)$ is supermodular for any given $y \in \Re^m$, then for a random vector $\xi$ in
$\Re^m$, $E_\xi[f(x, \xi)]$ is supermodular, provided it is well defined.

(e) Assume that $A$ is a lattice in $\Re^n \times \Re^m$ and a function $f(\cdot,\cdot) : A \to \Re$ is
supermodular. For any $x \in \Re^n$, let $S(x) = \{y \in \Re^m \mid (x, y) \in A\}$ and
$\Pi_y(A) = \{x \in \Re^n \mid S(x) \neq \emptyset\}$. If the function

$$g(x) = \sup_{y \in S(x)} f(x, y)$$

is finite on the lattice $\Pi_y(A)$, then g is supermodular over $\Pi_y(A)$.

Proof. Parts (a), (b) and (d) follow directly from the definition of supermodular
functions.

We now prove part (c). Assume that $g$ is increasing and supermodular and $f$ is
convex and nondecreasing. For any $x, x' \in \Re^n$, since $g$ is increasing, we conclude
that $g(x \wedge x') \leq g(x), g(x') \leq g(x \vee x')$. Therefore there exist $\lambda, \lambda' \in [0,1]$ such
that

$$g(x) = (1 - \lambda) g(x \wedge x') + \lambda g(x \vee x') \quad \text{and} \quad g(x') = (1 - \lambda') g(x \wedge x') + \lambda' g(x \vee x').$$

Since $g$ is supermodular, we have that $\lambda + \lambda' \leq 1$ and

$$f(g(x)) + f(g(x')) \leq f(g(x \wedge x')) + f(g(x \vee x'))$$
$$\qquad - (1 - \lambda - \lambda')\big( f(g(x \vee x')) - f(g(x \wedge x')) \big)$$
$$\leq f(g(x \wedge x')) + f(g(x \vee x')),$$

where the first inequality follows from the convexity of the function $f$ and the second
inequality holds since $\lambda + \lambda' \leq 1$, $f$ is nondecreasing and $g$ is increasing. Hence
$f(g(x))$ is supermodular. The same argument applies when $f$ is
convex and nonincreasing and $g$ is increasing and submodular.

For part (e), first notice that it is not difficult to show that $\Pi_y(A)$ is a lattice.
For any $x, x' \in \Pi_y(A)$ and any $y \in S(x)$, $y' \in S(x')$, we have, from the definition
of supermodularity, that

$$f(x, y) + f(x', y') \leq f(x \wedge x', y \wedge y') + f(x \vee x', y \vee y')$$
$$\leq g(x \wedge x') + g(x \vee x').$$

Finally, taking the supremum over $y \in S(x)$ and $y' \in S(x')$ on the left-hand side of
the above inequality, we have

$$g(x) + g(x') \leq g(x \wedge x') + g(x \vee x').$$

Thus $g$ is supermodular on $\Pi_y(A)$. ∎

The following result establishes some connections between convexity and supermodularity.

Theorem 2.3.6 Let $X$ be a lattice in $\Re^n$ and $a_i \in \Re$ ($i = 1, 2, \ldots, n$). For a
function $f : \Re \to \Re$, define $g : \Re^n \to \Re$ with $g(x) := f(\sum_{i=1}^{n} a_i x_i)$ for any
$x = (x_1, x_2, \ldots, x_n) \in \Re^n$. We have the following.

(a) If $a_i \geq 0$ for $i = 1, 2, \ldots, n$, and f is convex, then g is supermodular on X.

(b) If $n = 2$, $a_1 > 0$ and $a_2 < 0$, and f is concave, then g is supermodular on
X.

Suppose, in addition, that for any $x, x' \in \Re^n$ with $x \leq x'$, $x \in X$ implies
$x' \in X$, $\Re = \{\sum_{i=1}^{n} a_i x_i \mid x \in X\}$ and f is continuous.

(c) If $n \geq 2$, $a_1 > 0$ and $a_2 > 0$, and g is supermodular on X, then f is convex.

(d) If $n \geq 2$, $a_1 > 0$ and $a_2 < 0$, and g is supermodular on X, then f is concave.

(e) If $n \geq 3$, $a_1 > 0$, $a_2 > 0$ and $a_3 < 0$, and g is supermodular on X, then f is
a linear function.



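Proof. First observe that for any $x, x' \in \Re^n$,

$$\sum_{i=1}^{n} a_i x_i - \sum_{i=1}^{n} a_i (x_i \wedge x'_i) = \sum_{i=1}^{n} a_i (x_i \vee x'_i) - \sum_{i=1}^{n} a_i x'_i. \tag{2.12}$$

We now prove part (a). Since $a_i \geq 0$, we have

$$\sum_{i=1}^{n} a_i (x_i \wedge x'_i) \leq \sum_{i=1}^{n} a_i x_i,\ \sum_{i=1}^{n} a_i x'_i \leq \sum_{i=1}^{n} a_i (x_i \vee x'_i).$$

Therefore, there exist $\lambda, \lambda' \in [0,1]$ such that

$$\sum_{i=1}^{n} a_i x_i = (1 - \lambda) \sum_{i=1}^{n} a_i (x_i \wedge x'_i) + \lambda \sum_{i=1}^{n} a_i (x_i \vee x'_i),$$

and

$$\sum_{i=1}^{n} a_i x'_i = (1 - \lambda') \sum_{i=1}^{n} a_i (x_i \wedge x'_i) + \lambda' \sum_{i=1}^{n} a_i (x_i \vee x'_i).$$

Moreover the equation (2.12) implies $\lambda + \lambda' = 1$. Thus, from the convexity of $f$,
we have that

$$g(x) + g(x') = f\Big(\sum_{i=1}^{n} a_i x_i\Big) + f\Big(\sum_{i=1}^{n} a_i x'_i\Big)$$
$$\leq f\Big(\sum_{i=1}^{n} a_i (x_i \wedge x'_i)\Big) + f\Big(\sum_{i=1}^{n} a_i (x_i \vee x'_i)\Big)$$
$$= g(x \wedge x') + g(x \vee x').$$

Hence $g$ is supermodular on $X$.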

For part (b), we argue that for any $x = (x_1, x_2)$ and $x' = (x'_1, x'_2)$,

$$f(a_1 x_1 + a_2 x_2) - f(a_1 (x_1 \wedge x'_1) + a_2 (x_2 \wedge x'_2))$$
$$\leq f(a_1 (x_1 \vee x'_1) + a_2 (x_2 \vee x'_2)) - f(a_1 x'_1 + a_2 x'_2). \tag{2.13}$$

If $x \leq x'$ or $x \geq x'$, it is obvious that the inequality (2.13) holds true. Assume,
without loss of generality, that $x_1 = x_1 \vee x'_1$ and $x_2 = x_2 \wedge x'_2$. We have, from the
equation (2.12), that

$$\sum_{i=1}^{2} a_i x_i - \sum_{i=1}^{2} a_i (x_i \wedge x'_i) = \sum_{i=1}^{2} a_i (x_i \vee x'_i) - \sum_{i=1}^{2} a_i x'_i \geq 0.$$

It is also easy to verify that

$$a_1 x_1 + a_2 x_2 \geq a_1 (x_1 \vee x'_1) + a_2 (x_2 \vee x'_2).$$

Hence Proposition 2.2.6, together with the above inequality, implies the inequality
(2.13) and thus $g$ is supermodular.

To prove part (c), fix any $z, z' \in \Re$ with $z < z'$. Choose $x \in \Re^n$ such that
$\sum_{i=1}^{n} a_i x_i = z$. Let $\epsilon = (z' - z)/(2a_1) > 0$ and $\delta = (z' - z)/(2a_2) > 0$. Also let
$y = x + \epsilon e_1$, $y' = x + \delta e_2$ and $x' = x + \epsilon e_1 + \delta e_2$, where $e_i$ is the unit vector with 0
at all components except 1 at its $i$th component. From the definitions of $\epsilon$ and $\delta$,
we have that $\sum_{i=1}^{n} a_i y_i = \sum_{i=1}^{n} a_i y'_i = (z + z')/2$ and $\sum_{i=1}^{n} a_i x'_i = z'$. Hence the
supermodularity of $g$ implies that

$$f((z + z')/2) - \tfrac{1}{2}(f(z) + f(z')) = \tfrac{1}{2}(g(y) + g(y') - g(x) - g(x')) \leq 0,$$

since $x = y \wedge y'$ and $x' = y \vee y'$. Thus, the convexity of the function $f$ follows from
this inequality and the continuity of $f$.

We now prove part (d). Fix any $z, z' \in \Re$ with $z < z'$. Choose $x \in \Re^n$ such that
$\sum_{i=1}^{n} a_i x_i = (z + z')/2$. Let $\epsilon = (z' - z)/(2a_1) > 0$ and $\delta = -(z' - z)/(2a_2) > 0$.
Also let $y = x + \epsilon e_1$, $y' = x + \delta e_2$ and $x' = x + \epsilon e_1 + \delta e_2$. From the definitions of
$\epsilon$ and $\delta$, we have that $\sum_{i=1}^{n} a_i y_i = z'$, $\sum_{i=1}^{n} a_i y'_i = z$ and $\sum_{i=1}^{n} a_i x'_i = (z + z')/2$.
Hence the supermodularity of $g$ implies that

$$f((z + z')/2) - \tfrac{1}{2}(f(z) + f(z')) = \tfrac{1}{2}(g(x) + g(x') - g(y) - g(y')) \geq 0,$$

since $x = y \wedge y'$ and $x' = y \vee y'$. Thus, the concavity of the function $f$ follows from
this inequality and the continuity of $f$.

Finally, if $n \geq 3$, $a_1 > 0$, $a_2 > 0$ and $a_3 < 0$, the proofs of parts (c) and (d)
imply that $f$ is both convex and concave, and hence $f$ is linear. ∎

We now present the most important property of supermodular functions. This
result concerns a collection of optimization problems that are parameterized
by a parameter. The question is how the set of optimal solutions changes as the
parameter changes. For this purpose, we define a new concept of increasing set
function. Let $S(t)$ be a set function in $\Re^n$ parameterized by $t \in T \subset \Re^m$, that
is, for a parameter $t \in T$, $S(t)$ is a subset of $\Re^n$. The set function $S(t)$ is called
increasing in $t$ if for any $t, t' \in T$ with $t \leq t'$, $x \in S(t)$ and $x' \in S(t')$, we have
that $x \wedge x' \in S(t)$ and $x \vee x' \in S(t')$.

Let $T$ be a subset of $\Re^m$ and $A := \{(x, t) \mid t \in T, x \in S(t)\} \subset \Re^n \times \Re^m$. Let
$S^*(t) := \mathrm{argmax}_{x \in S(t)}\, g(x, t)$.

Theorem 2.3.7 Assume that $g(x, t) : A \to \Re$ is supermodular on $A$ and $S(t)$ is
increasing.

(a) $S^*(t)$ is increasing in $t$ on $\{t \in T \mid S^*(t) \neq \emptyset\}$.

(b) Assume, in addition, that $S(t)$ is a compact subset of $\Re^n$ for any $t \in T$, and
$g(x, t)$ is continuous in $x$ on $S(t)$ for any $t \in T$. Then $S^*(t)$ is nonempty and
there exist $\bar{x}(t), \underline{x}(t) \in S^*(t)$ such that for any $x \in S^*(t)$, $\underline{x}(t) \leq x \leq \bar{x}(t)$.
Furthermore, $\bar{x}(t)$ and $\underline{x}(t)$ are increasing.

Proof. To prove part (a), pick any $t, t' \in T$ with $t \leq t'$ such that $S^*(t)$ and $S^*(t')$
are nonempty. For any $x \in S^*(t)$ and $x' \in S^*(t')$, we have that $x \wedge x' \in S(t)$ and
$x \vee x' \in S(t')$, since $S(t)$ is increasing in $t$. The supermodularity of $g$ implies that

$$0 \geq g(x \wedge x', t) - g(x, t) \geq g(x', t') - g(x \vee x', t') \geq 0.$$

Thus $x \wedge x' \in S^*(t)$ and $x \vee x' \in S^*(t')$, which implies that $S^*(t)$ is increasing in
$t$ on $\{t \in T \mid S^*(t) \neq \emptyset\}$.

For part (b), we first prove the existence of the largest element, $\bar{x}(t)$, in $S^*(t)$
for any fixed $t \in T$, i.e., $x \leq \bar{x}(t)$ for any $x \in S^*(t)$, and that $\bar{x}(t)$ is increasing. Since
$g(x, t)$ is continuous in $x$ on the compact set $S(t)$, $S^*(t)$ is nonempty and compact for
any $t \in T$. Let $x^* = (x^*_1, x^*_2, \ldots, x^*_n) \in \Re^n$ with $x^*_i = \sup_{x \in S^*(t)} x_i$, $i = 1, 2, \ldots, n$.
Since $S^*(t)$ is compact, there exists $x^i \in S^*(t)$ such that $x^i_i = x^*_i$, $i = 1, 2, \ldots, n$.
Hence $x^* = x^1 \vee x^2 \vee \ldots \vee x^n \in S^*(t)$ since $S^*(t)$ is increasing from part (a).
Obviously, $x^*$ is the largest element of $S^*(t)$.

We now show that the largest element, $\bar{x}(t)$, of $S^*(t)$ is increasing. For any $t, t' \in T$
with $t \leq t'$, we have $\bar{x}(t') \leq \bar{x}(t) \vee \bar{x}(t') \in S^*(t')$. Hence $\bar{x}(t') = \bar{x}(t) \vee \bar{x}(t') \geq \bar{x}(t)$.
The properties regarding the smallest element, $\underline{x}(t)$, in $S^*(t)$ can be established
similarly. ∎
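Theorem 2.3.7 can be observed on a tiny discrete example. The sketch below is an illustration under assumptions made here: the objective $g(x,t) = xt - x^2$, the constant feasible set $S(t) = \{0, 1, \ldots, 10\}$ and the parameter range are not from the text. The term $xt$ has increasing differences in $(x,t)$ and $-x^2$ depends on $x$ alone, so $g$ is supermodular, and a constant chain $S(t)$ is trivially an increasing set function.

```python
def argmax_set(g, xs, t):
    """All maximizers of g(., t) over the finite feasible set xs."""
    best = max(g(x, t) for x in xs)
    return [x for x in xs if g(x, t) == best]

def g(x, t):
    # x * t has increasing differences in (x, t); -x * x depends on x only.
    return x * t - x * x

xs = range(0, 11)   # S(t) = {0, 1, ..., 10} for every t
ts = range(0, 21)   # parameter values t = 0, 1, ..., 20

largest = [max(argmax_set(g, xs, t)) for t in ts]    # largest maximizer for each t
smallest = [min(argmax_set(g, xs, t)) for t in ts]   # smallest maximizer for each t

print(largest)
print(smallest)
```

Both sequences are nondecreasing in $t$. For odd $t$ there are two maximizers, and the whole optimal set shifts upward as $t$ grows, as part (a) describes.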



2.4 Exercises 



Exercise 2.1. Assume that a function $f : \Re^n \to \Re$ is continuous. Prove that $f$ is
convex if and only if for any $x, x' \in \Re^n$, $f(\frac{x + x'}{2}) \leq (f(x) + f(x'))/2$.

Exercise 2.2. Let $Z$ be the set of all integers in $\Re$. Prove that a function $f$ is
convex on $Z$ if and only if either of the following two conditions holds.

(a) $\Delta f(x)$ is nondecreasing, where $\Delta f(x) = f(x + 1) - f(x)$.

(b) There exists a convex function $g$ on $\Re$ such that $g(x) = f(x)$ for all $x \in Z$.
In other words, $g$ is a convex extension of $f$.

Exercise 2.3. (Private Communication with Peng Sun) Assume that $f : \Re \to \Re$ is
convex, and random variables $X_1, X_2, \ldots$ are nonnegative and independently and
identically distributed. Prove that $E[f(\sum_{i=1}^{n} X_i)]$ is convex in $n$ on the set of natural
numbers.

Exercise 2.4. Prove Theorem 2.3.4. 

Exercise 2.5. Assume that a function $f(\cdot,\cdot)$ is defined on the product space
$\Re^n \times \Re^m$ and $f(\cdot, y)$ is convex for any given $y \in \Re^m$. Let $\xi$ be a random vector in
$\Re^m$. Prove the following.

(a) The function $e^x$ is strictly increasing and convex.

(b) The function $E_\xi[\exp(w + f(x, \xi))]$ is jointly convex in $x$ and $w$.

(c) The function $\ln(E_\xi[\exp(f(x, \xi))])$ is convex.

Exercise 2.6. Assume that $A$ is a lattice in the product space $\Re^n \times \Re^m$. Prove
that the set function $S(t) = \{x \in \Re^n \mid (x, t) \in A\}$ is increasing on the set
$\{t \in \Re^m \mid S(t) \neq \emptyset\}$.




3 

Worst-Case Analysis



3.1 Introduction 

Since most complicated logistics problems, for example, the Bin-Packing Problem
and the Traveling Salesman Problem, are NP-Hard, it is unlikely that polynomial time
algorithms will be developed for their optimal solutions. Consequently, a great deal
of work has been devoted to the development and analysis of heuristics. In this
chapter we demonstrate one important tool, referred to as worst-case performance 
analysis, which establishes the maximum deviation from optimality that can occur 
for a given heuristic algorithm. We will characterize the worst-case performance of 
a variety of algorithms for the Bin-Packing Problem and the Traveling Salesman 
Problem. The results obtained here serve as important building blocks in the 
analysis of algorithms for vehicle routing problems. 

Worst-case effectiveness is essentially measured in two different ways. Take a
generic problem, and let $I$ be a particular instance. Let $Z^*(I)$ be the total cost of
the optimal solution for instance $I$. Let $Z^H(I)$ be the total cost of the solution
provided by the heuristic $H$ on instance $I$. Then, the absolute performance ratio
of heuristic $H$ is defined as:

$$R^H = \inf \Big\{ r \geq 1 \,\Big|\, \frac{Z^H(I)}{Z^*(I)} \leq r, \text{ for all } I \Big\}.$$

This measure, of course, is specific to the particular problem. The absolute performance
ratio is often achieved for very small problem instances. It is therefore
desirable to have a measure that takes into account problems of large size only.
This measure is the asymptotic performance ratio. For a heuristic $H$, this ratio is
defined as:

$$R^H_\infty = \inf \Big\{ r \geq 1 \,\Big|\, \exists n \text{ such that } \frac{Z^H(I)}{Z^*(I)} \leq r, \text{ for all } I \text{ with } Z^*(I) \geq n \Big\}.$$

This measure sometimes gives a more accurate picture of a heuristic's performance.
Note that $R^H_\infty \leq R^H$.

In general, it is important to also show that no better worst-case bound (for a 
given heuristic) is possible. This is usually achieved by providing an example, or 
family of examples, where the bound is tight, or arbitrarily close to tight. 

In this chapter, we will analyze several heuristics for two difficult problems, 
the Bin-Packing Problem and the Traveling Salesman Problem, along with their 
worst-case performance bounds. 



3.2 The Bin-Packing Problem 



The Bin-Packing Problem (BPP) can be stated as follows: given a list of $n$ real
numbers $L = (w_1, w_2, \ldots, w_n)$, where we call $w_i \in (0,1]$ the size of item $i$, the
problem is to assign each item to a bin such that the sum of the item sizes in a
bin does not exceed 1, while minimizing the number of bins used. For simplicity,
we also use $L$ as a set, but this should cause no confusion. In this case, we write
$i \in L$ to mean $w_i \in L$.

Many heuristics have been developed for this problem since the early 1970s.
Some of the more popular ones are First-Fit (FF), Best-Fit (BF), First-Fit Decreasing
(FFD) and Best-Fit Decreasing (BFD), analyzed by Johnson et al. (1974).
First-Fit and Best-Fit assign items to bins according to the order they appear in
the list without using any knowledge of subsequent items in the list; these are
online algorithms. First-Fit can be described as follows: place item 1 in bin 1. Suppose
we are packing item $j$; place item $j$ in the lowest indexed bin whose current
content does not exceed $1 - w_j$. The BF heuristic is similar to FF except that it
places item $j$ in the bin whose current content is the largest but does not exceed
$1 - w_j$. In either case, a new bin is opened if no existing bin qualifies. In contrast
to these heuristics, FFD first sorts the items in non-increasing
order of their size and then performs FF. Similarly, BFD first sorts the items in
non-increasing order of their size and then performs BF. These are called offline
algorithms.
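The four heuristics can be sketched in a few lines. This is a minimal illustration, not a tuned implementation. The function names are chosen here, the `capacity` parameter generalizes the unit bin so that the example can use integer sizes (which avoids floating-point comparisons), and ties in Best-Fit are broken in favor of the lowest indexed bin, which is an assumption of this sketch.

```python
def first_fit(sizes, capacity=1):
    """First-Fit: put each item in the lowest indexed bin where it fits."""
    bins = []  # each bin is a list of item sizes
    for w in sizes:
        for b in bins:
            if sum(b) + w <= capacity:
                b.append(w)
                break
        else:  # no open bin can take the item: open a new one
            bins.append([w])
    return bins

def best_fit(sizes, capacity=1):
    """Best-Fit: put each item in the fullest bin where it still fits."""
    bins = []
    for w in sizes:
        feasible = [b for b in bins if sum(b) + w <= capacity]
        if feasible:
            max(feasible, key=sum).append(w)  # ties: lowest indexed bin
        else:
            bins.append([w])
    return bins

def first_fit_decreasing(sizes, capacity=1):
    """FFD: sort in non-increasing order, then run First-Fit."""
    return first_fit(sorted(sizes, reverse=True), capacity)

def best_fit_decreasing(sizes, capacity=1):
    """BFD: sort in non-increasing order, then run Best-Fit."""
    return best_fit(sorted(sizes, reverse=True), capacity)

# Integer sizes with capacity 10, i.e. sizes 0.4, 0.8, ... in units of 1/10.
items = [4, 8, 5, 1, 7, 6, 1, 4, 2, 2]
for heuristic in (first_fit, best_fit, first_fit_decreasing, best_fit_decreasing):
    print(heuristic.__name__, len(heuristic(items, capacity=10)))
```

On this list the total size is 40, so at least 4 bins are needed. The two online heuristics use 5 bins, while the two offline variants reach 4.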

Let $b^H(L)$ be the number of bins produced by a heuristic $H$ on list $L$. Similarly,
let $b^*(L)$ be the minimum number of bins required to pack the items in list $L$; that
is, $b^*(L)$ is the optimal solution to the bin-packing problem defined on list $L$.

The best asymptotic performance bounds for the FF and BF heuristics are given
in Garey et al. (1976) where they show that

$$b^{FF}(L) \leq \Big\lceil \frac{17}{10} b^*(L) \Big\rceil$$

and

$$b^{BF}(L) \leq \Big\lceil \frac{17}{10} b^*(L) \Big\rceil.$$

Here $\lceil x \rceil$ is defined as the smallest integer greater than or equal to $x$.

The best asymptotic performance bounds for FFD and BFD have been obtained
by Baker (1985) who shows that

$$b^{FFD}(L) \leq \frac{11}{9} b^*(L) + 3,$$

and

$$b^{BFD}(L) \leq \frac{11}{9} b^*(L) + 3.$$

Johnson et al. (1974) provide instances with arbitrarily large values of $b^*(L)$ such
that the ratios $\frac{b^{FF}(L)}{b^*(L)}$ and $\frac{b^{BF}(L)}{b^*(L)}$ approach $\frac{17}{10}$, and instances where $\frac{b^{FFD}(L)}{b^*(L)}$ and
$\frac{b^{BFD}(L)}{b^*(L)}$ approach $\frac{11}{9}$. Thus, the maximum deviation from optimality for all lists
that are sufficiently "large" is no more than 70% of the minimal number of
bins in the case of FF and BF, and 22.2% in the case of FFD and BFD.

We now show that by using simple arguments one can characterize the absolute
performance ratio for each of the four heuristics. We start however by demonstrating
that in general we cannot expect to find a polynomial time heuristic with
absolute performance ratio less than $\frac{3}{2}$.

Lemma 3.2.1 Suppose there exists a polynomial time heuristic $H$ for the BPP
with $R^H < 3/2$; then $P = NP$.

Proof. We show that if such a heuristic exists, then we can solve the NP-Complete
2-Partition Problem in polynomial time. This problem is defined as follows: given
a set $A = \{a_1, a_2, \ldots, a_n\}$, does there exist an $A_1 \subset A$ such that $\sum_{a_i \in A_1} a_i = \sum_{a_i \in A \setminus A_1} a_i$?

For a given instance $A$ of 2-Partition we construct an instance $L$ of the bin-packing
problem with item sizes $a_i$ and bins of capacity $\frac{1}{2} \sum_{i=1}^{n} a_i$. Observe that if
there exists an $A_1$ such that $\sum_{a_i \in A_1} a_i = \sum_{a_i \in A \setminus A_1} a_i = \frac{1}{2} \sum_{i=1}^{n} a_i$, then the heuristic $H$
must find a solution such that $b^H(L) = 2$, since $R^H < 3/2$ rules out a solution with 3 or more bins.
On the other hand, if there is no such
$A_1$ in the 2-Partition Problem, then the corresponding Bin-Packing Problem has
no solution with fewer than 3 bins and hence $b^H(L) \geq 3$.

Consequently, to solve the 2-Partition Problem, apply the heuristic $H$ to the
corresponding bin-packing problem. If $b^H(L) \geq 3$, there is no subset $A_1$ with the
desired property. Otherwise there is one. Since 2-Partition is NP-Complete, this
implies $P = NP$. ∎
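The reduction in the proof can be sketched as follows. No polynomial time heuristic with ratio below 3/2 is known, so the sketch substitutes an exponential-time exact solver, `min_bins_bruteforce`, as a stand-in for the hypothetical $H$. Both function names are chosen here, and item sizes are doubled so that the bin capacity (half the total) stays an integer.

```python
from itertools import product

def min_bins_bruteforce(sizes, capacity):
    """Exact bin packing by enumeration (exponential time).
    Stand-in for the hypothetical heuristic H; assumes every size <= capacity."""
    n = len(sizes)
    for k in range(1, n + 1):
        for assignment in product(range(k), repeat=n):
            loads = [0] * k
            for w, b in zip(sizes, assignment):
                loads[b] += w
            if max(loads) <= capacity:
                return k
    return n

def has_equal_partition(a):
    """Decide 2-Partition for positive integers via the bin-packing instance."""
    sizes = [2 * x for x in a]   # doubled sizes keep the capacity integral
    capacity = sum(a)            # half of the doubled total
    if max(sizes) > capacity:
        return False             # one element exceeds half the total
    # A partition exists exactly when two bins of this capacity suffice.
    return min_bins_bruteforce(sizes, capacity) <= 2

print(has_equal_partition([3, 1, 1, 2, 2, 1]))  # 3 + 2 = 1 + 1 + 2 + 1
print(has_equal_partition([2, 2, 3]))           # odd total
```

With an exact solver the answer is read off from whether two bins suffice. In the lemma the same test works for any $H$ with $R^H < 3/2$, because such a heuristic cannot return 3 bins when 2 are enough.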

Let XF be either FF or BF and let XFD be either FFD or BFD. In this section
we prove the following result due to Simchi-Levi (1994).

Theorem 3.2.2 For all lists $L$,

$$\frac{b^{XF}(L)}{b^*(L)} \leq \frac{7}{4}$$

and

$$\frac{b^{XFD}(L)}{b^*(L)} \leq \frac{3}{2}.$$

In view of Lemma 3.2.1 it is clear that FFD and BFD have the best possible
absolute performance ratios for the Bin-Packing Problem, among all polynomial
time heuristics. As Garey and Johnson (1979, p. 128) point out, it is easy to
construct examples in which an optimal solution uses 2 bins while FFD or BFD
uses 3 bins. Similarly, Johnson et al. give examples in which an optimal solution
uses 10 bins while FF and BF use 17 bins. Thus, the absolute performance ratio
for FFD and BFD is exactly $\frac{3}{2}$ while it is at least 1.7 and no more than $\frac{7}{4}$ for FF
and BF.

We now define the following terms which will be used throughout this section. 
An item is called large if its size is (strictly) greater than 0.5; otherwise it is called 
small. Define a bin to be of type I if it has only small items, and of type II if it is 
not a type I bin; that is, it has at least one large item in it. A bin is called feasible 
if the sum of the item sizes in the bin does not exceed 1. An item is said to fit in a 
bin if the bin resulting from the insertion of this item is a feasible bin. In addition, 
a bin is said to be opened when an item is placed in a bin that was previously 
empty. 

3.2.1 First-Fit and Best-Fit

The proof of the worst-case bounds for FF and BF, the first part of Theorem 3.2.2,
is based on the following observation. Recall that XF = FF or BF.

Lemma 3.2.3 Consider the $j$th bin opened by XF ($j \geq 2$). Any item that was
assigned to it before it was more than half full does not fit in any bin opened by
XF prior to bin $j$.

Proof. The property is clearly true for FF, and in fact holds for any item assigned
to the $j$th bin, $j \geq 2$, not necessarily to items assigned to it before it was more
than half full. To prove the property for BF, suppose by contradiction that item $i$ was
assigned to the $j$th bin before it was more than half full, and this item fits in one
of the previously opened bins, say the $k$th bin. Clearly, in that case, $i$ cannot be
the first item assigned to the $j$th bin since BF would not have opened a new bin
if $i$ fits in one of the previously opened bins. Let the levels of bins $k$ and $j$, just
before the time item $i$ was packed by BF, be $\alpha_k$ and $\alpha_j$ and let item $h$ be the first
item in bin $j$. Hence $w_h \leq \alpha_j \leq \frac{1}{2}$ by the hypothesis. Since BF assigns an item to
the bin where it fits with the largest content, and item $i$ would have fit in bin $k$,
we have $\alpha_j \geq \alpha_k$. Thus, $\alpha_k \leq \frac{1}{2}$, meaning that item $h$ would have fit in bin $k$, a
contradiction. ∎

We use Lemma 3.2.3 to construct a lower bound on the minimum number of
bins. For this purpose, we introduce the following procedure. For a given integer $v$,
$2 \le v \le b^{XF}(L)$, select $v$ bins from those produced by XF. Index the $v$ bins in the
order they are opened, starting with 1 and ending with $v$. Let $X_j$ be the set of items
assigned by XF to the $j$th bin before it was more than half full, $j = 1, 2, \ldots, v$. Let
$S_j$ be the set of items assigned by XF to the $j$th bin, $j = 1, 2, \ldots, v$. Observe that
$X_j \subseteq S_j$ for all $j = 1, 2, \ldots, v$.

Procedure LBBP (Lower Bound Bin-Packing)

Step 1: Let $X'_i = X_i$, $i = 1, 2, \ldots, v$.

Step 2: For $i = 1$ to $v - 1$ do

    Let $j = \max\{k : X'_k \ne \emptyset\}$.

    If $j \le i$, stop.

    Else, let $u$ be the smallest item in $X'_j$.

    Set $S_i \leftarrow S_i \cup \{u\}$ and $X'_j \leftarrow X'_j \setminus \{u\}$.



In view of Lemma 3.2.3 it is clear that Procedure LBBP generates nonempty
subsets $S_1, S_2, \ldots, S_m$, for some $m \le v$, such that $\sum_{i \in S_j} w_i > 1$ for $j \le m - 1$ and
possibly for $j = m$. This is true since by Lemma 3.2.3 item $u$ (as defined in the
LBBP procedure), originally assigned to bin $j$ before it was more than half full,
does not fit in any bin $i$ with $i < j$. Then the following must hold.

Lemma 3.2.4 $\max\left\{ \left| \bigcup_{j=m+1}^{v} X_j \right|,\; m - 1 \right\} < \sum_{j=1}^{m} \sum_{i \in S_j} w_i$.

Proof. Since bins $1, 2, \ldots, m - 1$ generated by Procedure LBBP are not feasible,
we have $\sum_{j=1}^{m} \sum_{i \in S_j} w_i > m - 1$. Note that every item in $\bigcup_{j=m+1}^{v} X_j$ is moved by
Procedure LBBP to exactly one $S_j$, $j = 1, 2, \ldots, m - 1$, and possibly to $S_m$. Thus,
if $S_m$ is feasible, that is, no (additional) item is assigned by Procedure LBBP to
$S_m$, then $\left| \bigcup_{j=m+1}^{v} X_j \right| \le m - 1 < \sum_{j=1}^{m} \sum_{i \in S_j} w_i$. On the other hand, if an item is
assigned by Procedure LBBP to $S_m$, then none of the subsets $S_j$, $j = 1, 2, \ldots, m$,
are feasible and therefore $m = \left| \bigcup_{j=m+1}^{v} X_j \right| < \sum_{j=1}^{m} \sum_{i \in S_j} w_i$. ∎

We are now ready to prove the first part of Theorem 3.2.2, that is, establish the
upper bound on the absolute performance ratio of the XF heuristic. Let $c$ be the
number of large items in the list $L$. Without loss of generality, assume $b^{XF}(L) > c$,
since otherwise the solution produced by XF is optimal. So, $b^{XF}(L) - c > 0$ is the
number of type I bins produced by XF. We consider the following two cases.

Case 1: $c$ is even. In this case we partition the bins produced by XF into two sets.
The first set includes only type I bins while the second set includes the remaining
bins produced by XF, that is, all the type II bins. Index the bins in the first set in
the order they are opened, from 1 to $b^{XF}(L) - c$. Let $v = b^{XF}(L) - c$, and apply
Procedure LBBP to the set of type I bins, producing $m$ bins out of which at least
$m - 1$ are infeasible. Then:

Lemma 3.2.5 If $c$ is even,

$$\max\left\{ \frac{c}{2} + m,\; 2\left(b^{XF}(L) - m\right) - \frac{3c}{2} \right\} \le b^*(L).$$

Proof. Combining Lemma 3.2.4 with the fact that no two large items fit in the
same bin, we have $\sum_{i \in L} w_i > m - 1 + \frac{c}{2}$. On the other hand, every bin in an optimal
solution is feasible and therefore $\sum_{i \in L} w_i \le b^*(L)$. Since $c$ is even, $m + \frac{c}{2} \le b^*(L)$.
Since we applied Procedure LBBP only to the type I bins produced by XF, each
one of these bins has at least two items except possibly one, which may have only
one item. Hence, $2\left(b^{XF}(L) - m - c - 1\right) + 1 \le \left| \bigcup_{j=m+1}^{v} X_j \right|$ and therefore, using
Lemma 3.2.4,

$$2\left(b^{XF}(L) - m - c - 1\right) + \frac{c}{2} + 1 < \sum_{i \in L} w_i \le b^*(L),$$

or

$$2\left(b^{XF}(L) - m - c - 1\right) + \frac{c}{2} + 2 \le b^*(L).$$

Rearranging the left-hand side gives the second lower bound. ∎

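Theorem 3.2.6 If $c$ is even,

$$b^{XF}(L) \le \frac{7}{4} b^*(L).$$

Proof. From Lemma 3.2.5 we have $2\left(b^{XF}(L) - m\right) - \frac{3c}{2} \le b^*(L)$. Hence,

$$b^{XF}(L) \le \frac{b^*(L)}{2} + \frac{3c}{4} + m = \frac{b^*(L)}{2} + \left(m + \frac{c}{2}\right) + \frac{c}{4} \le \frac{7}{4} b^*(L),$$

since $m + \frac{c}{2}$ and $c$ are both lower bounds on $b^*(L)$. ∎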

Case 2: $c$ is odd. In this case we partition the set of all bins generated by the XF
heuristic in a slightly different way. The first set of bins, called $B_1$, comprises all
the type I bins except the last type I bin opened by XF. The second set is made
up of the remaining bins; that is, these are all the type II bins together with the
type I bin not included in $B_1$. We now apply Procedure LBBP to the bins in $B_1$
(with $v = b^{XF}(L) - c - 1$), producing $m$ bins out of which at least $m - 1$ bins are
not feasible.

Lemma 3.2.7 If $c$ is odd,

$$\max\left\{ \frac{c-1}{2} + m + 1,\; 2\left(b^{XF}(L) - m\right) - \frac{3c}{2} - \frac{1}{2} \right\} \le b^*(L).$$



Proof. Take one of the type II bins and “match” it with the only type I bin not
in $B_1$; the total weight of these two bins is more than 1. Thus, using Property 2.2,
we have $\frac{c-1}{2} + 1 + (m - 1) < \sum_{i \in L} w_i \le b^*(L)$, which proves the first lower bound.
To prove the second lower bound, we use the fact that every bin in $B_1$ has at least
2 items and therefore $2\left(b^{XF}(L) - m - c - 1\right) \le \left| \bigcup_{j=m+1}^{v} X_j \right|$. Using Property 2.2,
we get

$$2\left(b^{XF}(L) - m - c - 1\right) + \frac{c-1}{2} + 1 < \sum_{i \in L} w_i \le b^*(L),$$

or

$$2\left(b^{XF}(L) - m - c - 1\right) + \frac{c-1}{2} + 2 \le b^*(L).$$

Rearranging the left-hand side gives the second lower bound. ∎

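Theorem 3.2.8 If $c$ is odd,

$$b^{XF}(L) \le \frac{7}{4} b^*(L) - \frac{1}{4}.$$

Proof. From Lemma 3.2.7 we have $2\left(b^{XF}(L) - m\right) - \frac{3c}{2} - \frac{1}{2} \le b^*(L)$. Hence,

$$b^{XF}(L) \le \frac{b^*(L)}{2} + \frac{3c}{4} + \frac{1}{4} + m = \frac{b^*(L)}{2} + \left(\frac{c-1}{2} + m + 1\right) + \frac{c}{4} - \frac{1}{4} \le \frac{7}{4} b^*(L) - \frac{1}{4}.$$

∎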



3.2.2 First-Fit Decreasing and Best-Fit Decreasing

The proof of the worst-case bounds for FFD and BFD is based on Lemma 3.2.3.
This lemma states that if a bin produced by these heuristics contains only items
of size at most $\frac{1}{2}$, then the first two items assigned to the bin cannot fit in any bin
opened prior to it.
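As a hedged illustration, First-Fit Decreasing can be sketched as First-Fit applied to the list sorted in nonincreasing order of size. This assumes the standard definition of FFD, which is not restated in this section; the function name and the tolerance `EPS` are illustrative.

```python
from typing import List

EPS = 1e-9  # tolerance for floating-point sums (an implementation assumption)


def first_fit_decreasing(items: List[float]) -> List[List[float]]:
    """Sort items by nonincreasing size, then place each one in the
    lowest-indexed bin in which it fits, opening a new bin if none does."""
    bins: List[List[float]] = []
    for w in sorted(items, reverse=True):
        for b in bins:
            if sum(b) + w <= 1 + EPS:
                b.append(w)
                break
        else:
            bins.append([w])  # no opened bin can take the item
    return bins
```

For the list (0.2, 0.5, 0.4, 0.7, 0.1, 0.3, 0.8), whose sizes sum to 3, the sketch uses three bins, which is optimal for this particular instance.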

Let XFD denote either FFD or BFD. Index the bins produced by XFD in the
order they are opened. We consider three cases. First, suppose $b^{XFD}(L) = 3p$ for
some integer $p \ge 1$. Consider the bin with index $2p + 1$. If this bin contains a
large item we are done, since in that case $b^*(L) > 2p = \frac{2}{3} b^{XFD}(L)$. Otherwise, bins
$2p + 1$ through $3p$ must contain at least $2p - 1$ small items, none of which can fit
in the first $2p$ bins. Hence, the total sum of the item sizes exceeds $2p - 1$, meaning
that $b^*(L) \ge 2p = \frac{2}{3} b^{XFD}(L)$.

Suppose $b^{XFD}(L) = 3p + 1$. If bin $2p + 1$ contains a large item we are done.
Otherwise, bins $2p + 1$ through $3p + 1$ contain at least $2p + 1$ small items, none
of which can fit in the first $2p$ bins, implying that the total sum of the item sizes
exceeds $2p$ and hence $b^*(L) \ge 2p + 1 > \frac{2}{3} b^{XFD}(L)$.

Similarly, suppose $b^{XFD}(L) = 3p + 2$. If bin $2p + 2$ contains a large item we are
done. Otherwise, bins $2p + 2$ through $3p + 2$ contain at least $2p + 1$ small items,
none of which can fit in the first $2p + 1$ bins, implying the sum of the item sizes
exceeds $2p + 1$ and hence $b^*(L) \ge 2p + 2 > \frac{2}{3} b^{XFD}(L)$.

3.3 The Traveling Salesman Problem



Interesting worst-case results have been obtained for another combinatorial
problem that plays an important role in the analysis of logistics systems: the Traveling
Salesman Problem (TSP). The problem can be defined as follows: Let $G = (V, E)$
be a complete undirected graph with vertices $V$, $|V| = n$, and edges $E$, and let
$d_{ij}$ be the length of edge $(i, j)$. (We use the term length to designate the “cost” of
using edge $(i, j)$. The most general formulation of the TSP allows for completely
arbitrary “lengths” and, in fact, in many applications the physical distance is
irrelevant and the $d_{ij}$ simply represents the cost of sequencing $j$ immediately after
$i$.) The objective in the TSP is to find a tour that visits each vertex exactly once
and whose total length is as small as possible. The problem has been analyzed
extensively in the last three decades; see Lawler et al. (1985) for an excellent
survey and, in particular, the chapter written by Johnson and Papadimitriou (1985),
which includes some of the worst-case results presented here.

We shall examine a variety of heuristics for the TSP and show that, for an
important special case of this problem, heuristics with strong worst-case bounds
exist. We start, however, with a negative result, due to Sahni and Gonzalez (1976),
which states that in general finding a heuristic for the TSP with a constant
worst-case bound is as hard as solving any $\mathcal{NP}$-Complete problem, no matter what the
bound.

To present the result, let $I$ be an instance of the TSP. Let $L^*(I)$ be the length
of the optimal traveling salesman tour through $V$. Given a heuristic $H$, let $L^H(I)$
be the length of the tour generated by $H$.

Theorem 3.3.1 Suppose there exists a polynomial time heuristic $H$ for the TSP
and a constant $R^H$ such that for all instances $I$

$$\frac{L^H(I)}{L^*(I)} \le R^H;$$

then $\mathcal{P} = \mathcal{NP}$.



Proof. The proof is in the same spirit as the proof of Lemma 3.2.1. Suppose
such a heuristic exists. We will use it to solve the $\mathcal{NP}$-Complete Hamiltonian
Cycle Problem in polynomial time. The Hamiltonian Cycle Problem is defined as
follows. Given a graph $G = (V, E)$, does there exist a simple cycle (a cycle that
does not visit a point more than once) in $G$ that includes all of $V$? To answer this
question we construct an instance $I$ of the TSP and apply $H$ to it; the length of
the tour generated by $H$ will tell us whether $G$ has a Hamiltonian cycle.

The instance $I$ is defined on a complete graph whose set of vertices is $V$ and
the length of each edge $\{i, j\}$ is

$$d_{ij} = \begin{cases} 1, & \text{if } \{i, j\} \in E; \\ |V| R^H, & \text{otherwise.} \end{cases}$$

We distinguish between two cases depending on whether $G$ contains a Hamiltonian
cycle. If $G$ does not contain a Hamiltonian cycle, then any traveling salesman
tour in $I$ must contain at least one edge with length $|V| R^H$ and hence the length
of the tour generated by $H$ is at least $|V| R^H + |V| - 1$.

On the other hand, if $G$ has a Hamiltonian cycle, then $I$ must have a tour of
length $|V|$. This is true since we can use the Hamiltonian cycle as a traveling
salesman tour for the instance $I$ in which the vertices appear on the traveling
salesman tour in the same order they appear in the Hamiltonian cycle. Thus, if $G$
has a Hamiltonian cycle, heuristic $H$ applied to $I$ must provide a tour of length no
more than $|V| R^H$.

Consequently, we have a method for solving the Hamiltonian Cycle Problem:
apply $H$ to the TSP defined on the instance $I$. If $L^H(I) \le |V| R^H$, then there exists
a Hamiltonian cycle in $G$. Otherwise, there is no such cycle in $G$. Finally, since $H$
is assumed to be polynomial, we conclude that $\mathcal{P} = \mathcal{NP}$. ∎
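The construction used in the proof can be sketched in a few lines. The sketch below builds the distance matrix of the instance $I$ and applies the decision rule of the proof. Since no polynomial heuristic $H$ with a constant bound is known, a brute-force exact solver (so $R^H = 1$) is used here purely as a stand-in; it runs in exponential time and is only an illustration. All function names are hypothetical.

```python
from itertools import permutations
from typing import Iterable, List, Tuple


def tsp_instance(n: int, edges: Iterable[Tuple[int, int]], R: float) -> List[List[float]]:
    """Distance matrix from the proof: 1 on edges of G, n * R on all other pairs."""
    E = {frozenset(e) for e in edges}
    return [[0.0 if i == j else (1.0 if frozenset((i, j)) in E else n * R)
             for j in range(n)] for i in range(n)]


def exact_tour_length(d: List[List[float]]) -> float:
    """Stand-in for heuristic H: exhaustive search, hence ratio R = 1."""
    n = len(d)
    best = float("inf")
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        length = sum(d[tour[i]][tour[(i + 1) % n]] for i in range(n))
        best = min(best, length)
    return best


def has_hamiltonian_cycle(n: int, edges: Iterable[Tuple[int, int]], R: float = 1.0) -> bool:
    """Decision rule of the proof: G is Hamiltonian iff the tour length is at most n * R."""
    return exact_tour_length(tsp_instance(n, edges, R)) <= n * R
```

On a 4-cycle the rule answers yes (a tour of length 4 exists), while on a star with three leaves every tour must use at least one edge of length $|V| R^H$, so the rule answers no.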

The theorem thus implies that it is very unlikely that a polynomial time heuristic
for the TSP with a constant absolute worst-case bound exists. However, there is an
important version of the Traveling Salesman Problem that excludes the above
negative result. This is when the distance matrix $\{d_{ij}\}$ satisfies the triangle inequality
assumption.

Definition 3.3.2 A distance matrix satisfies the triangle inequality assumption if
for all $i, j, k \in V$ we have $d_{ij} \le d_{ik} + d_{kj}$.

In many logistics environments, the triangle inequality assumption is not a very
restrictive one. It merely states that traveling directly from point (vertex) $i$ to
point (vertex) $j$ costs at most as much as traveling from $i$ to $j$ through the point $k$.
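Definition 3.3.2 can be checked directly on a given distance matrix. The following is a minimal sketch; the function name and the tolerance parameter `tol` are assumptions made for illustration.

```python
from typing import List


def satisfies_triangle_inequality(d: List[List[float]], tol: float = 1e-12) -> bool:
    """Return True if d[i][j] <= d[i][k] + d[k][j] for all i, j, k."""
    n = len(d)
    return all(d[i][j] <= d[i][k] + d[k][j] + tol
               for i in range(n) for j in range(n) for k in range(n))
```

For example, distances between points on a line satisfy the assumption, whereas a matrix with $d_{02} = 5$ and $d_{01} = d_{12} = 1$ does not.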

In the next four sections we describe and analyze different heuristics developed
for the TSP. To simplify presentation in what follows, we write $L^*$ instead of $L^*(I)$;
this should cause no confusion.

3.3.1 A Minimum Spanning Tree Based Heuristic 

The following algorithm provides a simple example of how a fixed worst-case bound 
is possible for the TSP when the distance matrix satisfies the triangle inequality 
assumption. In this case, the bound is 2; that is, the heuristic provides a solution 
with total length at most 100% above the length of an optimal tour. 

A spanning tree of a graph $G = (V, E)$ is a connected subgraph with $|V| - 1$
edges spanning all of $V$. The cost (or weight) of a tree is the sum of the lengths of
the edges in the tree. A minimum spanning tree (MST) is a spanning tree with
minimum cost. It is well known and easy to show that a minimum spanning tree
can be found in polynomial time (see, for example, Papadimitriou and Steiglitz
(1982)). If $W^*$ denotes the weight (cost) of the minimum spanning tree, then we
must have $W^* \le L^*$ since deleting any edge from the optimal tour results in a
spanning tree.

The minimum spanning tree can be used to find a feasible traveling salesman
tour in polynomial time. The idea is to perform a depth-first search (see Aho et
al. (1974)) over the minimum spanning tree and then to do simple improvements
on this solution. Formally, this is done as follows (Johnson and Papadimitriou,
1985).

A Minimum Spanning Tree Based Heuristic 

Step 1: Construct a minimum spanning tree and color its edges white, and all other
edges black.

Step 2: Let the current vertex (denoted $v$) be an arbitrary vertex.

Step 3: If one of the edges adjacent to $v$ in the MST is white, color it black and
proceed to the vertex at the other end of this edge. Else (all edges from $v$
are black), go back along the edge by which the current vertex was originally
reached.

Step 4: Let this vertex be $v$. Stop if $v$ is the vertex you started with and all edges of
the MST are black. Otherwise go to Step 3.



Observe that the above strategy produces a tour that starts and ends at one 
of the vertices and visits all other vertices in the graph covering each arc twice. 
This is not a very efficient tour since some vertices may be visited more than once. 
To improve on this tour, we can modify the above strategy as follows: instead of 
going back to a visited vertex, we can use a shortcut strategy in which we skip 
this vertex, and go directly to the next unvisited vertex. The triangle inequality 
assumption implies that the above modification will not increase the length of the 
tour, and in fact may reduce it. 
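The heuristic with shortcuts can be sketched as follows. The sketch assumes a symmetric distance matrix `d` given as a list of lists, uses Prim's algorithm for the minimum spanning tree (any MST algorithm would do), and starts the walk at vertex 0. Listing the vertices in the order of their first visit during the depth-first search is the same as applying the shortcut strategy to the doubled-tree walk. The function names are illustrative.

```python
import math
from typing import List, Tuple


def mst_tour(d: List[List[float]]) -> Tuple[List[int], float]:
    """Return (tour, W): the shortcut depth-first walk of an MST and the MST weight."""
    n = len(d)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known connection of each vertex to the tree
    parent = [-1] * n
    children: List[List[int]] = [[] for _ in range(n)]
    best[0] = 0.0
    for _ in range(n):  # Prim's algorithm, O(n^2)
        v = min((u for u in range(n) if not in_tree[u]), key=lambda u: best[u])
        in_tree[v] = True
        if parent[v] >= 0:
            children[parent[v]].append(v)
        for u in range(n):
            if not in_tree[u] and d[v][u] < best[u]:
                best[u], parent[u] = d[v][u], v
    # Preorder traversal: each vertex is recorded at its first visit only,
    # which skips every vertex that the doubled-tree walk would revisit.
    tour, stack = [], [0]
    while stack:
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(children[v]))
    return tour, sum(best)


def tour_length(d: List[List[float]], tour: List[int]) -> float:
    """Length of the closed tour that returns to its first vertex."""
    return sum(d[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))
```

When `d` satisfies the triangle inequality, `tour_length(d, tour)` is at most twice the returned tree weight.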

Let $L^{MST}$ be the length of the traveling salesman tour generated by the above
strategy. We clearly have

$$L^{MST} \le 2W^* \le 2L^*,$$

where the first inequality follows since without shortcuts the length of the tour is
exactly $2W^*$. This proves that the worst-case bound of the algorithm is at most 2. It
remains to verify that the worst-case bound of this heuristic cannot be improved.
For this purpose consider Figure 3.1, the example constructed by Johnson and 
Papadimitriou (1985). Here, W* = § + § (1 - e) + 2e - 1, L MST » ^ + ^(1 - e), 
and L* = 

3.3.2 The Nearest Insertion Heuristic 

Before describing this heuristic, consider the following intuitively appealing
strategy, called the Nearest Neighbor Heuristic. Given an instance $I$ of the TSP, start
with an arbitrary vertex and find the vertex not yet visited that is closest to the
current vertex. Travel to this vertex. Repeat this until all vertices are visited; then
go back to the starting vertex.
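A minimal sketch of the Nearest Neighbor Heuristic follows. The distance matrix `d` is assumed to be a list of lists; breaking ties by the smaller vertex index is an arbitrary choice made here so that the output is deterministic.

```python
from typing import List


def nearest_neighbor_tour(d: List[List[float]], start: int = 0) -> List[int]:
    """Visit, at each step, the closest vertex not yet visited.
    The returned list is the visiting order; the tour closes back at `start`."""
    tour = [start]
    unvisited = set(range(len(d))) - {start}
    while unvisited:
        cur = tour[-1]
        nxt = min(unvisited, key=lambda v: (d[cur][v], v))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour
```

Note that the length of the closing edge, from the last vertex back to `start`, is never considered while the path is being built.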







FIGURE 3.1. An example for the minimum spanning tree based algorithm with n = 18.

Unfortunately, Rosenkrantz et al. (1977) show the existence of a family of
instances for the TSP with arbitrary $n$ with the following property. The length of
the tour generated by the Nearest Neighbor Heuristic on each instance in the
family is $O(\log n)$ times the length of the optimal tour. Thus, the Nearest Neighbor
Heuristic does not have a bounded worst-case performance.

This comes as no surprise since the algorithm obviously suffers from one major 
weakness. This “greedy” strategy tends to begin well, inserting very short arcs 
into the path, but ultimately it ends with arcs that are quite long. For instance, 
the last edge added, the one connecting the last node to the starting node, may be 
very long due to the fact that at no point does the heuristic consider the location 
of the starting vertex and possible ending vertices. 

One way to improve the performance of the Nearest Neighbor Heuristic is pre- 
sented in the following variant, called the Nearest Insertion (NI) Heuristic, devel- 
oped and analyzed by Rosenkrantz et al. Informally, the heuristic works as follows: 
at each iteration of the heuristic a Hamiltonian cycle containing a subset of the 
vertices is constructed. The heuristic then selects a new vertex not yet in the cycle 
that is “closest” in a specific sense and inserts it between two adjacent vertices in 
the cycle. The process stops when all vertices are in the cycle. Formally, this is 
done as follows. 

The Nearest Insertion Heuristic 

Step 1: Choose an arbitrary node $v$ and let the cycle $C$ consist of only $v$.

Step 2: Find a node outside $C$ closest to a node in $C$; call it $k$.

Step 3: Find an edge $\{i, j\}$ in $C$ such that $d_{ik} + d_{kj} - d_{ij}$ is minimal.

Step 4: Construct a new cycle $C$ by replacing $\{i, j\}$ with $\{i, k\}$ and $\{k, j\}$.

Step 5: If the current cycle $C$ contains all the vertices, stop. Otherwise, go to Step 2.
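The five steps can be sketched as follows, assuming a symmetric distance matrix `d` with zero diagonal. When the cycle consists of a single vertex, the "edge" considered in Step 3 is the degenerate pair formed by that vertex with itself, so the insertion cost is $2 d_{vk}$. Ties are broken by the smaller index, an arbitrary choice. The function names are illustrative.

```python
from typing import List


def nearest_insertion_tour(d: List[List[float]], start: int = 0) -> List[int]:
    """Nearest Insertion: grow a cycle by inserting the closest outside vertex
    at the position where it increases the cycle length the least."""
    cycle = [start]  # Step 1
    outside = set(range(len(d))) - {start}
    while outside:  # Step 5 is the loop condition
        # Step 2: the outside vertex closest to some vertex of the cycle.
        k = min(outside, key=lambda v: (min(d[v][c] for c in cycle), v))
        # Step 3: the cycle edge {i, j} minimizing d_ik + d_kj - d_ij.
        m = len(cycle)
        p = min(range(m), key=lambda q: d[cycle[q]][k]
                + d[k][cycle[(q + 1) % m]]
                - d[cycle[q]][cycle[(q + 1) % m]])
        # Step 4: replace {i, j} with {i, k} and {k, j}.
        cycle.insert(p + 1, k)
        outside.remove(k)
    return cycle


def tour_length(d: List[List[float]], tour: List[int]) -> float:
    """Length of the closed tour that returns to its first vertex."""
    return sum(d[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))
```

On four points located at 0, 1, 3 and 6 on a line, the sketch returns a tour of length 12, which is optimal for that instance.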

Let $L^{NI}$ be the length of the solution obtained by the Nearest Insertion Heuristic.
Then:

Theorem 3.3.3 For all instances of the TSP satisfying the triangle inequality,

$$L^{NI} \le 2L^*.$$

We start by proving the following interesting result. Let $T$ be a spanning tree
of $G$ and let $W(T)$ be the weight (cost) of that tree; that is, $W(T)$ is the sum of
the lengths of all edges in the tree $T$. Then:

Lemma 3.3.4 For every spanning tree $T$,

$$L^{NI} \le 2W(T).$$

Proof. We prove the lemma by matching each vertex we insert during the execution
of the algorithm with a single edge of the given tree $T$. To do that we describe a
procedure that will be carried out in parallel to the Nearest Insertion Heuristic.

The Dual Nearest Insertion Procedure 

Step 1: Start with a family $\mathcal{T}$ of trees that, at first, consists of only the tree $T$.

Step 2: Given $k$ (the vertex selected in Step 2 of NI), find the unique tree in $\mathcal{T}$
containing $k$. Let this tree be $T_k$.

Step 3: Let $\ell$ be the unique vertex in $T_k$ that is in the current cycle.

Step 4: Let $h$ be the vertex adjacent to $\ell$ on the unique path from $\ell$ to $k$. Replace
$T_k$ in $\mathcal{T}$ by the two trees obtained from $T_k$ by deleting edge $\{\ell, h\}$.

Step 5: If $\mathcal{T}$ contains $n$ trees, stop. Otherwise, go to Step 2.

The Dual Nearest Insertion procedure is carried out in parallel to the Nearest 
Insertion Heuristic in the sense that each time Step 1 is performed in the lat- 
ter procedure, Step 1 is performed in the former procedure. Each time Step 2 is 
performed in the latter, Step 2 is performed in the former, etc. 

Observe that each time Step 4 of the Dual Nearest Insertion procedure is
performed, the set of trees $\mathcal{T}$ is updated so that each tree in $\mathcal{T}$ has exactly one vertex
from the current cycle and each vertex of the current cycle belongs to exactly one
tree. This is true since when edge $\{\ell, h\}$ is deleted, two subtrees are constructed,
one containing the vertex $\ell$ and the other containing the vertex $k$. Edge $\{\ell, h\}$ is
the one we associate with the insertion of vertex $k$.

Let $m$ be the vertex in the current cycle to which vertex $k$ (not in the cycle)
was closest. That is, $m$ is the vertex such that $d_{km}$ is the smallest among all $d_{uv}$
where $u$ is in the cycle and $v$ outside the cycle. Let $m + 1$ be one of the vertices
on the cycle adjacent to $m$. Finally, let edge $\{i, j\}$ be the edge deleted from the
current cycle. Clearly, inserting $k$ into the current cycle increases the length of the
tour by

$$d_{ik} + d_{kj} - d_{ij} \le d_{mk} + d_{k,m+1} - d_{m,m+1} \le 2 d_{mk},$$

where the left-hand inequality holds because of Step 3 of the Nearest Insertion
Heuristic and the right-hand inequality holds in view of the triangle inequality
assumption. This of course is true only when the cycle contains at least two vertices.
When it contains exactly one vertex, that is, when the Nearest Insertion algorithm
enters Step 2 for the first time, inserting $k$ into the current cycle increases the length
of the tour by exactly $2 d_{mk}$.

Since $\ell$ is in the current cycle and $h$ is not, $d_{mk} \le d_{\ell h}$. Hence, the increase in
the cost of the current cycle is no more than $2 d_{\ell h}$. Finally, since this relationship
holds for every edge of $T$ and the corresponding inserted vertex, we have

$$L^{NI} \le 2W(T).$$

FIGURE 3.2. An example for the nearest insertion algorithm with n = 8.



To finish the proof of Theorem 3.3.3, apply Lemma 3.3.4 with $T = T^*$, the minimum
spanning tree; thus,

$$W^* = W(T^*) \le L^* \le L^{NI} \le 2W(T^*),$$

and hence $L^{NI} \le 2W^* \le 2L^*$. This completes the proof of the Theorem.

To see that the bound is tight consider the example (constructed by Rosenkrantz
et al., 1977) depicted in Figure 3.2. In this example, the length of every edge
connecting two consecutive vertices on the perimeter is 1 while all other edges have
length 2. Thus, the optimal traveling salesman tour visits the vertices according
to their appearance on the circle and therefore $L^* = n$. It is easy to see that the
Nearest Insertion Heuristic generates the tour depicted in Figure 3.2(b) with cost
$L^{NI} = 2n - 2$.

3.3.3 Christofides’ Heuristic

In 1976, Christofides presented a very simple algorithm that currently has the best 
known worst-case performance bound for the TSP. To present the algorithm we 
need to state several properties of graphs. 

Lemma 3.3.5 Given a graph with at least two vertices, the number of vertices 
with odd degree is even. 

Definition 3.3.6 An Eulerian Tour is a tour that traverses all edges of a graph
exactly once.

Definition 3.3.7 An Eulerian Graph is a graph that has an Eulerian Tour.

Then it is a simple exercise to show the following. 

Lemma 3.3.8 A connected graph is Eulerian if and only if the degree of each
vertex is even.

Christofides’ algorithm starts with a minimum spanning tree. Of course, this
tree (as any other tree) is not Eulerian since some of the vertices have odd degree.
We can augment the graph (by adding suitably chosen arcs) so that it becomes
Eulerian. In fact, we would like to add a number of arcs connecting odd degree
vertices so that they then have even degree. To do this, we will find a minimum
weight matching among the odd degree vertices.

Given a graph with an even number of vertices, a matching is a subset of edges
with the property that every vertex is the end-point of exactly one edge of the
subset. In the minimum weight matching problem the objective is to find a matching
whose total edge length is minimum. This problem can be solved in
$O(n^3)$ time, where $n$ is the number of vertices in the graph (see Lawler (1976)).

Lemma 3.3.5 tells us that the number of vertices with odd degree in the MST is 
even. Thus, adding the edges of a matching defined on those odd degree vertices 
clearly increases the degree of each of these vertices by one. The resulting graph 
is Eulerian, by Lemma 3.3.8. Of course, to minimize the total cost, we would 
like to select the edges of a minimum weight matching. Finally, the Eulerian tour 
generated is transformed into a traveling salesman tour using shortcuts, similarly 
to what was done in the minimum spanning tree based heuristic of Section 3.3.1. 
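The construction just described can be sketched end to end for small instances. The sketch below assumes a symmetric distance matrix `d` with zero diagonal and uses Prim's algorithm for the tree. The matching step is solved by exhaustive recursion, which is an illustrative stand-in for the $O(n^3)$ matching algorithm cited above and is practical only for a handful of odd-degree vertices. The Eulerian tour is found with Hierholzer's method. The function name is hypothetical.

```python
import math
from typing import List, Tuple


def christofides_tour(d: List[List[float]]) -> List[int]:
    n = len(d)
    # 1. Minimum spanning tree (Prim's algorithm).
    parent, best, in_tree = [-1] * n, [math.inf] * n, [False] * n
    best[0] = 0.0
    edges: List[Tuple[int, int]] = []
    for _ in range(n):
        v = min((u for u in range(n) if not in_tree[u]), key=best.__getitem__)
        in_tree[v] = True
        if parent[v] >= 0:
            edges.append((parent[v], v))
        for u in range(n):
            if not in_tree[u] and d[v][u] < best[u]:
                best[u], parent[u] = d[v][u], v
    # 2. Vertices of odd degree in the tree (their number is even).
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    odd = [v for v in range(n) if deg[v] % 2 == 1]

    # 3. Minimum weight matching on the odd vertices (brute force stand-in).
    def match(vs: List[int]) -> Tuple[float, List[Tuple[int, int]]]:
        if not vs:
            return 0.0, []
        a, rest = vs[0], vs[1:]
        best_cost, best_pairs = math.inf, []
        for i, b in enumerate(rest):
            cost, pairs = match(rest[:i] + rest[i + 1:])
            if d[a][b] + cost < best_cost:
                best_cost, best_pairs = d[a][b] + cost, [(a, b)] + pairs
        return best_cost, best_pairs

    edges += match(odd)[1]
    # 4. Eulerian tour of the tree-plus-matching multigraph (Hierholzer).
    adj: List[List[Tuple[int, int]]] = [[] for _ in range(n)]
    for idx, (a, b) in enumerate(edges):
        adj[a].append((b, idx))
        adj[b].append((a, idx))
    used = [False] * len(edges)
    stack, euler = [0], []
    while stack:
        v = stack[-1]
        while adj[v] and used[adj[v][-1][1]]:
            adj[v].pop()  # discard edges already traversed from the other end
        if adj[v]:
            u, idx = adj[v].pop()
            used[idx] = True
            stack.append(u)
        else:
            euler.append(stack.pop())
    # 5. Shortcuts: keep only the first occurrence of each vertex.
    seen, tour = set(), []
    for v in euler:
        if v not in seen:
            seen.add(v)
            tour.append(v)
    return tour
```

For larger instances the recursive `match` would have to be replaced by a polynomial matching algorithm; the rest of the sketch is unchanged by that substitution.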

Let $L^C$ be the length of the tour generated by Christofides’ Heuristic. We prove:

Theorem 3.3.9 For all instances of the TSP satisfying the triangle inequality,
we have

$$L^C \le \frac{3}{2} L^*.$$

Proof. Recall that $W^* = W(T^*)$ is the cost of the MST and let $W(M^*)$ be the
weight of the minimum weight matching, that is, the sum of the edge lengths of all
edges in the optimal matching. Because of the triangle inequality assumption,

$$L^C \le W(T^*) + W(M^*).$$

We already know that $W(T^*) \le L^*$. It remains to show that $W(M^*) \le \frac{1}{2} L^*$.
For this purpose index the vertices of odd degree in the minimum spanning tree
$i_1, i_2, \ldots, i_{2k}$ according to their appearance on an optimal traveling salesman
tour. Consider two feasible solutions for the minimum weight matching problem
defined on these vertices. The first matching, denoted $M^1$, consists of edges
$\{i_1, i_2\}, \{i_3, i_4\}, \ldots, \{i_{2k-1}, i_{2k}\}$. The second matching, denoted $M^2$, consists of
edges $\{i_2, i_3\}, \{i_4, i_5\}, \ldots, \{i_{2k}, i_1\}$.

We clearly have $W(M^*) \le \frac{1}{2}\left[W(M^1) + W(M^2)\right]$. The triangle inequality assumption
tells us that $W(M^1) + W(M^2) \le L^*$; see Figure 3.3. Hence $W(M^*) \le \frac{1}{2} L^*$
and consequently,

$$L^C \le W(T^*) + W(M^*) \le \frac{3}{2} L^*.$$

∎

As in the two previous heuristics, this bound is tight. Consider the example
depicted in Figure 3.4 for which $L^* = n$ while $L^C = n - 1 +$

FIGURE 3.3. The matching and the optimal traveling salesman tour.

FIGURE 3.4. An example for Christofides’ algorithm with n = 7.

3.3.4 Local Search Heuristics

Some of the oldest and, by far, the most extensively used heuristics developed for
the traveling salesman problem are the so-called $k$-opt procedures ($k \ge 2$). These
heuristics, part of the extensive class of local search procedures, can be described
as follows. Given a traveling salesman tour through the set of vertices $V$, say the
sequence

$$\{i_1, i_2, \ldots, i_{u_1}, i_{u_2}, \ldots, i_{v_1}, i_{v_2}, \ldots, i_n\},$$

an $\ell$-exchange is a procedure that replaces $\ell$ edges currently in the tour by $\ell$
new edges so that the result is again a traveling salesman tour. For instance, a
2-exchange procedure replaces edges $\{i_{u_1}, i_{u_2}\}$ and $\{i_{v_1}, i_{v_2}\}$ with $\{i_{u_1}, i_{v_1}\}$ and
$\{i_{u_2}, i_{v_2}\}$ and results in a new tour

$$\{i_1, \ldots, i_{u_1}, i_{v_1}, i_{v_1 - 1}, \ldots, i_{u_2}, i_{v_2}, i_{v_2 + 1}, \ldots, i_n\}.$$

An improving $\ell$-exchange is an $\ell$-exchange that results in a tour whose total
length (cost) is smaller than the cost of the original tour.

A $k$-opt procedure starts from an arbitrary traveling salesman tour and, using
improving $\ell$-exchanges, for $\ell \le k$, successively generates tours of smaller and
smaller length. The procedure terminates when no improving $\ell$-exchange is found
for all $\ell \le k$. Let $L^{OPT(k)}$ be the length of the tour generated by a $k$-opt heuristic,
for $k \ge 2$.
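A minimal sketch of a 2-opt procedure follows. It assumes a symmetric distance matrix `d` and applies the first improving 2-exchange it finds, repeating until none exists; other selection rules (for example, taking the best improving exchange) are equally valid. A 2-exchange is carried out by reversing the segment of the tour between the two removed edges. The function name and the small tolerance used to guard against rounding are illustrative.

```python
from typing import List


def two_opt(d: List[List[float]], tour: List[int]) -> List[int]:
    """Apply improving 2-exchanges until the tour is 2-optimal."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for a in range(n - 1):
            for b in range(a + 2, n):
                if a == 0 and b == n - 1:
                    continue  # these two tour edges share vertex tour[0]
                i, i2 = tour[a], tour[a + 1]
                j, j2 = tour[b], tour[(b + 1) % n]
                # Replace edges {i, i2} and {j, j2} by {i, j} and {i2, j2}.
                delta = d[i][j] + d[i2][j2] - d[i][i2] - d[j][j2]
                if delta < -1e-12:
                    tour[a + 1:b + 1] = tour[a + 1:b + 1][::-1]
                    improved = True
    return tour
```

Started from the self-crossing tour (0, 2, 1, 3) on a square whose sides have length 1 and whose diagonals have length 2, the sketch returns the perimeter tour (0, 1, 2, 3).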

Recently, Chandra et al. (1995) obtained interesting results on the worst-case
performance of the $k$-opt heuristic. They show

Theorem 3.3.10 For all instances of the TSP satisfying the triangle inequality
we have

$$\frac{L^{OPT(2)}}{L^*} \le 4\sqrt{n}.$$

In addition, there exists an infinitely large family of TSP instances satisfying the
triangle inequality assumption for which

$$\frac{L^{OPT(2)}}{L^*} \ge \frac{1}{4}\sqrt{n}.$$

They also provide a lower bound on the worst-case performance of $k$-opt for
all $k \ge 3$.



Theorem 3.3.11 There exists an infinitely large family of TSP instances
satisfying the triangle inequality assumption with

$$\frac{L^{OPT(k)}}{L^*} \ge \frac{1}{4} n^{\frac{1}{2k}}$$

for any $k \ge 2$.

Thus, the above results indicate that the worst-case performances of $k$-opt
heuristics are quite poor. By contrast, many researchers and practitioners have
reported that $k$-opt heuristics can be highly effective; see, for instance, Golden
and Stewart (1985).

This raises a fundamental dilemma. Although worst-case analysis provides a 
rigid guarantee on a heuristic’s performance, it suffers from being highly deter- 
mined by certain pathological examples. Is there a more appropriate measure to 
assess the effectiveness of a particular heuristic, one that would assess the effec- 
tiveness on an average or realistic example? We will try to address this question 
in the next chapter. 



3.4 Exercises 



Exercise 3.1. Prove Lemma 3.3.8. 

Exercise 3.2. The 2-TSP is the problem of designing two tours that together visit 
each of the customers and use the same starting point. Show that any algorithm 
for the TSP can solve this problem as well. 

Exercise 3.3. (Papadimitriou and Steiglitz, 1982) Consider the $n$-city TSP in
which the triangle inequality assumption holds. Let $c^* > 0$ be the length of an
optimal tour, and let $d$ be the length of the second best tour. Prove: $(d - c^*)/c^* \le \frac{2}{n}$.

Exercise 3.4. Prove that in every completely connected directed graph (a graph 
in which between every pair of vertices there is a directed edge in one of the two 
possible directions) there is a directed Hamiltonian Path. 

Exercise 3.5. Let $Z^C$ be the length of the tour provided by Christofides’ Heuristic,
and let $Z^*$ be the length of the optimal tour. Construct an example with $Z^C = \frac{3}{2} Z^*$.

Exercise 3.6. Prove that for any graph G there exists an even number of nodes 
with odd degree. 

Exercise 3.7. Let $G$ be a tree with $n \ge 2$ nodes. Show that:

(a) There exist at least two nodes with degree one. 

(b) The number of arcs is n — 1. 

Exercise 3.8. Consider the $n$-city TSP defined with distances $d_{ij}$. Assume that
there exist $a, b \in \mathbb{R}^n$ such that for each $i$ and $j$, $d_{ij} = a_i + b_j$. What is the length
of the optimal traveling salesman tour? Explain your solution.

Exercise 3.9. Consider the TSP with the triangle inequality assumption and
two prespecified nodes $s$ and $t$. Assume that the traveling salesman tour has to
include edge $(s, t)$ (that is, the salesman has to travel from $s$ directly to $t$). Modify
Christofides’ Heuristic for this model and show that the worst-case bound is $\frac{3}{2}$.

Exercise 3.10. Show that a minimum spanning tree T satisfies the following 
property. When T is compared with any other spanning tree T' , the k th shortest 
edge of T is no longer than the k th shortest edge of T', for k = 1, 2, . . . , n — 1. 

Exercise 3.11. (Papadimitriou and Steiglitz, 1982) The Wandering Salesman 
Problem (WSP) is a Traveling Salesman Problem except that the salesman can 
start wherever he or she wishes and does not have to return to the starting city 
after visiting all cities. 

(a) Describe a heuristic for the WSP with worst-case bound $\frac{3}{2}$.

(b) Show that the same bound can be obtained for the problem when one of the
end-points of the path is specified in advance.



Exercise 3.12. (Papadimitriou and Steiglitz, 1982) Which of the following prob- 
lems remain essentially unchanged (complexity-wise) when they are transformed 
from minimization to maximization problems? Why? 

(a) Traveling Salesman Problem. 

(b) Shortest Path from s to t. 

(c) Minimum Weight Matching. 

(d) Minimum Spanning Tree. 



Exercise 3.13. Suppose there are $n$ jobs that require processing on $m$ machines.
Each job must be processed by machine 1, then by machine 2, ..., and finally
by machine $m$. Each machine can work on at most one job at a time and once it
begins work on a job it must work on it until completion, without interruption.
The amount of time machine $j$ must process job $i$ is denoted $p_{ij} \ge 0$ (for $i =
1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$). Further suppose that once the processing of a job
is completed on machine $j$, its processing must begin immediately on machine $j + 1$
(for $j \le m - 1$). This is a flow shop with no wait-in-process.

Show that the problem of sequencing the jobs so that the last job is completed
as early as possible can be formulated as an $(n + 1)$-city TSP. Specifically, show
how the $d_{ij}$ values for the TSP can be expressed in terms of the $p_{ij}$ values.

Exercise 3.14. Consider the Bin-Packing Problem with items of size $w_i$, $i =
1, 2, \ldots, n$, such that $0 < w_i \le 1$. The objective is to find the minimum number
of unit size bins $b^*$ needed to pack all the items without violating the capacity
constraint.

(a) Show that $\sum_{i=1}^{n} w_i$ is a lower bound on $b^*$.

(b) Define a locally optimal solution to be one where no two bins can be feasibly 
combined into one. Show that any locally optimal solution uses no more than 
twice the minimum number of bins, that is, no more than 2b* bins. 

(c) The Next-Fit Heuristic is the following. Start by packing the first item in bin 
1. Then, each subsequent item is packed in the last opened bin if possible, 
or else a new bin is opened and it is placed there. Show that the Next-Fit 
Heuristic produces a solution with at most 2b* bins. 
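
The Next-Fit rule in part (c) can be sketched in a few lines. This is only an illustration under the assumptions of the exercise (unit-capacity bins, every item size in $(0, 1]$); the function name `next_fit` and its `capacity` parameter are our own and do not come from the text.

```python
def next_fit(weights, capacity=1.0):
    """Pack items in the given order, looking only at the last opened bin."""
    bins = []
    for w in weights:
        if bins and sum(bins[-1]) + w <= capacity:
            bins[-1].append(w)   # the item fits in the last opened bin
        else:
            bins.append([w])     # otherwise open a new bin for it
    return bins


# Small illustration with sizes that are exact in binary floating point.
example = next_fit([0.5, 0.75, 0.25, 0.5, 0.5])
```

Because only the last opened bin is ever inspected, any two consecutive bins together hold more than one unit of load, which is the observation the exercise asks the reader to exploit.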



Exercise 3.15. (Anily et al., 1994) Consider the Bin-Packing Problem and the 
Next-Fit Increasing heuristic. In this strategy items are ordered in a nondecreasing 
order according to their size. Start by packing the first item in bin 1. Then each 
subsequent item is packed in the last opened bin if possible, or else a new bin is 
opened and it is placed there. Show that the number of bins produced by this 
strategy is no more than $\frac{7}{4}$ times the optimal number of bins. For this purpose, 
consider the following two steps. 

(a) Consider the following procedure. First order the items in nondecreasing 
order of their size. When packing bin $i \ge 1$, follow the packing rule: if the 
bin is currently feasible (i.e., total load is no more than 1), then assign the 
next item to this bin; otherwise, close this bin, open bin i + 1 and put this 
item in bin i + 1. Show that the number of bins generated by this procedure 
is a lower bound on the minimal number of bins needed. 

(b) Relate this lower bounding procedure to the number of bins produced by the 
Next-Fit Increasing heuristic. 



Exercise 3.16. Given a network $G = (V, E)$ and edge length $l_e$ for every $e \in E$,
assume that edge $(u, v)$ has a variable length $x$. Find an expression for the length
of the shortest path from $s$ to $t$ as a function of $x$.

Exercise 3.17. A complete directed network $G = (V, A)$ is a directed graph such
that for every pair of vertices $u, v \in V$, there are arcs $u \to v$ and $v \to u$ in $A$ with
nonnegative arc lengths $d(u, v)$ and $d(v, u)$, respectively. The network $G = (V, A)$
satisfies the triangle inequality if for all $u, v, w \in V$, $d(u, v) + d(v, w) \ge d(u, w)$.

A directed cycle is a sequence of vertices $v_1 \to v_2 \to \cdots \to v_\ell \to v_1$ without
any repeated vertex other than the first and last ones. If the cycle contains all the
vertices in $G$, then it is said to be a directed Hamiltonian cycle. To keep notation
simple, let $d_{ij} = d(v_i, v_j)$.

A directed cycle containing exactly $k$ vertices is called a $k$-cycle. The length of
a cycle is defined as the sum of arc lengths used in the cycle. A directed network
$G = (V, A)$ with $|V| \ge k$ is said to be $k$-symmetric if for every $k$-cycle
$v_1 \to v_2 \to \cdots \to v_k \to v_1$ in $G$,

$$d_{12} + d_{23} + \cdots + d_{k-1,k} + d_{k1} = d_{1k} + d_{k,k-1} + \cdots + d_{32} + d_{21}.$$

In other words, a $k$-symmetric network is a directed network in which the length
of every $k$-cycle remains unchanged if its orientation is reversed.

(a) Show that the asymmetric Traveling Salesman Problem on a $|V|$-symmetric
network (satisfying the triangle inequality) can be solved via solving a corresponding
symmetric Traveling Salesman Problem. In particular, show that any heuristic
with fixed worst-case bound for the symmetric Traveling Salesman Problem can be
used for the asymmetric Traveling Salesman Problem on a $|V|$-symmetric network
to obtain a result with the same worst-case bound.

(b) Prove that any 3-symmetric network is $k$-symmetric for $k = 4, 5, \ldots, |V|$.
Thus part (a) can be used if we have a 3-symmetric network. Argue that a
3-symmetric network can be identified in polynomial time.




4 

Average-Case Analysis 



4.1 Introduction 

Worst-case performance analysis is one method of characterizing the effectiveness 
of a heuristic. It provides a guarantee on the maximum relative difference between 
the solution generated by the heuristic and the optimal solution for any possible 
problem instance, even those that are not likely to appear in practice. Thus, a 
heuristic that works well in practice may have a weak worst-case performance, 
if, for example, it provides very bad solutions for one (or more) pathological
instance(s).

To overcome this important drawback, researchers have recently focused on 
probabilistic analysis of algorithms with the objective of characterizing the aver- 
age performance of a heuristic under specific assumptions on the distribution of 
the problem data. As pointed out, for example, by Coffman and Lueker (1991), 
probabilistic analysis is frequently quite difficult and even the analysis of simple 
heuristics can often present a substantial challenge. Therefore, usually the anal- 
ysis is asymptotic. That is, the average performance of a heuristic can only be 
quantified when the problem size is extremely large. 

As we demonstrate in Parts II and IV, an asymptotic probabilistic analysis is 
useful for several reasons: 

1. It can foster new insights into which algorithmic approaches will be effective 
for solving large size problems. That is, the analysis provides a framework 
where one can analyze and compare the performance of heuristics on large 
size problems. 

2. For problems with fast rates of convergence, the analysis can sometimes 
explain the observed empirical behavior of heuristics for more reasonable 
size problems. 

3. The approximations derived from the analysis can be used in other models 
and may lead to a better understanding of the tradeoffs in more complex 
problems integrating vehicle routing with other issues important to the firm, 
such as inventory control. 

In this chapter we present some of the basic tools used in the analysis of the 
average performance of heuristics. Again we use the Bin-Packing Problem and the 
Traveling Salesman Problem as the “raw materials” on which to present them. 



4.2 The Bin-Packing Problem 

The Bin-Packing Problem provides a very well studied example for which to 
demonstrate the benefits of a probabilistic analysis. 

Without loss of generality, we scale the bin capacity $q$ so that it is 1. Consider
the item sizes $w_1, w_2, w_3, \ldots$ to be independently and identically distributed on
$(0, 1]$ according to some general distribution $\Phi$. In this section we demonstrate two
elegant and powerful techniques that can be used in the analysis of $b_n^*$, the random
variable representing the optimal solution value on the items $w_1, w_2, \ldots, w_n$. The
first is the theory of subadditive processes and the second is the theory of martingale
inequalities.

Subadditive Processes 

Let $\{a_n\}$, $n \ge 1$, be a sequence of positive real numbers. We say that the
sequence is subadditive if for all $n$ and $m$ we have $a_n + a_m \ge a_{n+m}$. The following
important result was proved by Kingman (1976) and Steele (1990), whose proof we
follow.

Theorem 4.2.1 If the sequence $\{a_n\}$, $n \ge 1$, is subadditive, then there exists a
constant $\gamma$ such that

$$\lim_{n \to \infty} \frac{a_n}{n} = \gamma.$$

Proof. Let $\gamma = \liminf_{n \to \infty} \frac{a_n}{n}$. For a given $\epsilon > 0$ select $n$ such that
$\frac{a_n}{n} \le \gamma + \epsilon$. Since the sequence $\{a_n\}$ is subadditive we have

$$a_{nm} \le a_n + a_{n(m-1)}.$$

Making a repeated use of this inequality we get $a_{nm} \le m a_n$, which implies

$$\frac{a_{nm}}{nm} \le \frac{a_n}{n} \le \gamma + \epsilon.$$

For any $k$, $0 \le k < n$, define $\ell = nm + k$. Using subadditivity again, we have

$$a_\ell = a_{nm+k} \le a_{nm+k-1} + a_1 \le a_{nm} + k a_1 \le a_{nm} + n a_1,$$

where the second inequality is obtained by repeating the first one $k$ times. Thus,

$$\frac{a_\ell}{\ell} = \frac{a_{nm+k}}{nm+k} \le \frac{a_{nm}}{nm+k} + \frac{n a_1}{nm+k}
\le \frac{a_{nm}}{nm} + \frac{a_1}{m} \le \gamma + \epsilon + \frac{a_1}{m}.$$

Taking the limit with respect to $m$ we have

$$\limsup_{\ell \to \infty} \frac{a_\ell}{\ell} \le \gamma + \epsilon + \lim_{m \to \infty} \frac{a_1}{m} = \gamma + \epsilon.$$

The proof is therefore complete since $\epsilon$ was chosen arbitrarily.
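
As a quick numerical illustration of Theorem 4.2.1, consider the sequence $a_n = n + \sqrt{n}$. This sequence is our own example, not one from the text; it is subadditive because $\sqrt{n+m} \le \sqrt{n} + \sqrt{m}$, and the ratio $a_n / n$ decreases to the constant $\gamma = 1$. The sketch below checks subadditivity on a finite range and tabulates the ratio.

```python
import math


def a(n: int) -> float:
    """Hypothetical subadditive sequence a_n = n + sqrt(n) (illustration only)."""
    return n + math.sqrt(n)


def is_subadditive(seq, limit: int) -> bool:
    """Check a_{n+m} <= a_n + a_m for all 1 <= n, m <= limit."""
    return all(seq(n + m) <= seq(n) + seq(m) + 1e-12
               for n in range(1, limit + 1)
               for m in range(1, limit + 1))


# a_n / n = 1 + 1 / sqrt(n), so the ratios approach gamma = 1 from above.
ratios = {n: a(n) / n for n in (1, 10, 100, 10_000, 1_000_000)}
```

A finite check of this kind is of course no proof; it only makes the statement of the theorem concrete.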

It is clear that the optimal solution of the Bin-Packing Problem possesses a
subadditivity-like property; that is, for any sets $S, T \subseteq N$:

$$b^*(S \cup T) \le b^*(S) + b^*(T),$$

where $b^*(S)$ denotes the optimal solution to the bin-packing problem on a set
$S \subseteq N$. Using similar arguments as in the above analysis shows that there exists a
constant $\gamma$ such that the optimal solution to the Bin-Packing Problem $b_n^*$ satisfies

$$\lim_{n \to \infty} \frac{b_n^*}{n} = \gamma \quad (a.s.).$$

In addition, $\gamma$ is dependent only on the item size distribution $\Phi$.



The Uniform Model 



To illustrate the concepts just developed, consider the case where $\Phi$ is the
uniform distribution on $[0, 1]$. In order to pack a set of $n$ items drawn randomly from
this distribution, we use the following Sliced Interval Partitioning heuristic with
parameter $r$ ($SIP(r)$). It works as follows. For any fixed positive integer $r \ge 1$, the
set of items $N$ is partitioned into the following $2r$ disjoint subsets, some of which
may be empty:






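$$N_j = \left\{ k \in N \,\middle|\, \tfrac{1}{2}\left(1 - \tfrac{j+1}{r}\right) < w_k \le \tfrac{1}{2}\left(1 - \tfrac{j}{r}\right) \right\}, \quad j = 1, 2, \ldots, r - 1,$$

and

$$\bar{N}_j = \left\{ k \in N \,\middle|\, \tfrac{1}{2}\left(1 + \tfrac{j-1}{r}\right) < w_k \le \tfrac{1}{2}\left(1 + \tfrac{j}{r}\right) \right\}, \quad j = 1, 2, \ldots, r - 1.$$

Also

$$N_0 = \left\{ k \in N \,\middle|\, \tfrac{1}{2}\left(1 - \tfrac{1}{r}\right) < w_k \le \tfrac{1}{2} \right\}$$

and

$$\bar{N}_r = \left\{ k \in N \,\middle|\, \tfrac{1}{2}\left(1 + \tfrac{r-1}{r}\right) < w_k \le 1 \right\}.$$

The number of items in each $N_j$ (respectively, $\bar{N}_j$) is denoted by $n_j$ (respectively,
$\bar{n}_j$) for all possible values of $j$.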

Note that for any $j = 1, 2, \ldots, r - 1$, one bin can hold an item from $N_j$ together
with exactly one item from $\bar{N}_j$. The $SIP(r)$ heuristic generates pairs of items, one
item from $N_j$ and one from $\bar{N}_j$, for every $j = 1, 2, \ldots, r - 1$. The items in $N_0 \cup \bar{N}_r$
are put in individual bins; one bin is assigned to each of these items.

For any $j = 1, 2, \ldots, r - 1$, we arbitrarily match one item from $N_j$ with exactly
one item from $\bar{N}_j$; one bin holds each such pair. If $n_j = \bar{n}_j$, then all the items in
$N_j \cup \bar{N}_j$ are matched. If, however, $n_j \ne \bar{n}_j$, then we can match exactly $\min\{n_j, \bar{n}_j\}$
pairs of items. The remaining $|n_j - \bar{n}_j|$ items in $N_j \cup \bar{N}_j$ that have not yet been
matched are put one per bin. Thus, the total number of bins used is

$$n_0 + \bar{n}_r + \sum_{j=1}^{r-1} \max\{n_j, \bar{n}_j\}.$$

The heuristic clearly generates a feasible solution to the Bin-Packing Problem.
Since

$$\lim_{n \to \infty} \frac{n_j}{n} = \lim_{n \to \infty} \frac{\bar{n}_j}{n} = \frac{1}{2r} \quad (a.s.) \text{ for all possible values of } j,$$

we have

$$\gamma = \lim_{n \to \infty} \frac{b_n^*}{n} \le \lim_{n \to \infty} \frac{1}{n}\left( n_0 + \bar{n}_r + \sum_{j=1}^{r-1} \max\{n_j, \bar{n}_j\} \right) = \frac{1}{2} + \frac{1}{2r} \quad (a.s.).$$

Since this holds for any $r \ge 1$, we see that $\gamma \le \frac{1}{2}$. Since $\gamma \ge E(w)$ (see Exercise
4.4), then $\gamma \ge \frac{1}{2}$ and we conclude that $\gamma = \frac{1}{2}$ for the uniform distribution on $[0, 1]$.
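
The following is a minimal sketch of an $SIP(r)$-style packing, written under the assumption that the $2r$ subsets are the consecutive slices of length $\frac{1}{2r}$ of the interval $(0, 1]$, with slice $j$ below $\frac{1}{2}$ paired with slice $j$ above $\frac{1}{2}$. The names `sip` and `bins_per_item` and the index arithmetic are our own. The sketch can be used to compare the observed number of bins per item with the limiting value $\frac{1}{2} + \frac{1}{2r}$.

```python
import math
import random


def sip(weights, r):
    """Pack items in (0, 1] with an SIP(r)-style rule; returns a list of bins."""
    small = [[] for _ in range(r)]      # small[j]: sizes in (1/2 - (j+1)/(2r), 1/2 - j/(2r)]
    large = [[] for _ in range(r + 1)]  # large[j]: sizes in (1/2 + (j-1)/(2r), 1/2 + j/(2r)]
    for w in weights:
        if w <= 0.5:
            j = min(r - 1, math.floor(r * (1 - 2 * w)))
            small[j].append(w)
        else:
            j = min(r, max(1, math.ceil(r * (2 * w - 1))))
            large[j].append(w)
    # Slice 0 (just below 1/2) and slice r (the largest items): one bin per item.
    bins = [[w] for w in small[0]] + [[w] for w in large[r]]
    for j in range(1, r):
        s, l = small[j], large[j]
        k = min(len(s), len(l))
        bins += [[s[i], l[i]] for i in range(k)]             # matched pairs
        bins += [[w] for w in s[k:]] + [[w] for w in l[k:]]  # leftovers, one per bin
    return bins


def bins_per_item(n, r, seed=0):
    """Average number of bins per item on n uniform(0, 1] items."""
    rng = random.Random(seed)
    weights = [1.0 - rng.random() for _ in range(n)]
    return len(sip(weights, r)) / n
```

For example, with $r = 10$ the ratio returned by `bins_per_item` should settle near $0.55$ for large $n$, slightly above it for moderate $n$ because the matched slices are rarely of exactly equal size.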

Using this idea, we can actually devise an asymptotically optimal heuristic for
instances where the item sizes are uniformly distributed on $[0, 1]$. To formally
define this property, let $Z_n^*$ be the cost of the optimal solution to a problem of
size $n$, and let $Z_n^H$ be the cost of the solution provided by a heuristic
$H$. Let the relative error of a heuristic $H$ on a particular instance of $n$ points be

$$e_n^H = \frac{Z_n^H - Z_n^*}{Z_n^*}.$$

Definition 4.2.2 Let $\Psi$ be a probability measure on the set of instances $\mathcal{I}$. A
heuristic $H$ is asymptotically optimal for $\Psi$ if almost surely

$$\lim_{n \to \infty} e_n^H = 0,$$

where the problem data are generated randomly from $\Psi$.

That is, under certain assumptions on the distribution of the data, $H$ generates
solutions whose relative error tends to zero as $n$, the number of points, tends to
infinity. The above $SIP(r)$ heuristic is not asymptotically optimal since for any
fixed $r$, the relative error converges to $\frac{1}{r}$.

A truly asymptotically optimal heuristic can easily be constructed. The following 
heuristic is called MATCH. First, sort the items in nonincreasing order of the item 
sizes. Then take the largest item, say item i, and match it with the largest item 
with which it will fit. If no such item exists, then put item i in a bin alone. 
Otherwise, put item i and the item it was matched with in a bin together. Now 
repeat this until all items are packed. The proof of asymptotic optimality is given 
as an exercise (Exercise 4.11). 
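
The MATCH rule can be sketched as follows. This is an illustrative implementation only: the function name `match` is our own, and the use of a sorted list with binary search is a convenience, not something prescribed by the text. Removing an element from the middle of a Python list takes linear time, so the sketch runs in $O(n^2)$ in the worst case.

```python
from bisect import bisect_right


def match(weights):
    """MATCH-style packing: pair the largest item with the largest item that fits."""
    rest = sorted(weights)               # ascending, so the largest item is last
    bins = []
    while rest:
        w = rest.pop()                   # largest remaining item
        idx = bisect_right(rest, 1.0 - w) - 1   # largest remaining item that still fits
        if idx >= 0:
            bins.append([w, rest.pop(idx)])
        else:
            bins.append([w])             # nothing fits: the item is packed alone
    return bins


# Sizes chosen to be exact in binary floating point.
example = match([0.75, 0.25, 0.5, 0.5, 0.625])
```

Each bin receives at most two items, exactly as in the description above.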

An additional use for the bin-packing constant $\gamma$ is as an approximation for
the number of bins needed. When $n$ is large, the number of bins required to
pack $n$ random items from $\Phi$ is very close to $n\gamma$. How close the random variable
representing the number of bins is to $n\gamma$ is discussed next.



Martingale Inequalities 

Consider the stochastic processes $\{X_n\}$ and $\{Y_n\}$ with $n \ge 0$. We say that the
stochastic process $\{X_n\}$ is a martingale with respect to $\{Y_n\}$ if for every $n \ge 0$ we
have

(i) $E[|X_n|] < +\infty$, and
(ii) $E[X_{n+1} \mid Y_1, \ldots, Y_n] = X_n$.

To get some insight into the definition of a martingale consider someone playing
a sequence of fair games. Let $X_n = Y_n$ be the amount of money the player has at
the end of the $n$th game. If $\{X_n\}$ is a martingale with respect to $\{Y_n\}$, then this
says that the expected amount of money the player will have at the end of the
$(n + 1)$st game is equal to what the player had at the beginning of that game, $X_n$,
regardless of the game's history prior to state $n$. See Karlin and Taylor (1975) for
details.

Consider now the random variable: 



$$D_n = E[X_{n+1} \mid Y_1, \ldots, Y_n] - E[X_{n+1} \mid Y_1, \ldots, Y_{n-1}].$$

The sequence $\{D_n\}$ is called a martingale difference sequence if $E[D_n] = 0$ for every
$n \ge 0$. Azuma (1967) developed the following interesting inequality for martingale
difference sequences; see also Stout (1974) or Rhee and Talagrand (1987).

Lemma 4.2.3 Let $\{D_i\}$, $i = 1, 2, \ldots, n$, be a martingale difference sequence. Then
for every $t > 0$ we have

$$Pr\left\{ \left| \sum_{i \le n} D_i \right| > t \right\} \le 2 \exp\left\{ -t^2 \Big/ \left( 2 \sum_{i \le n} \|D_i\|_\infty^2 \right) \right\},$$

where $\|D_i\|_\infty$ is a uniform upper bound on the $D_i$'s.

The Lemma can be used to establish upper bounds on the probable deviations
of both

• $b_n^*$ from its mean $E[b_n^*]$, and

• $\frac{b_n^*}{n}$ from its asymptotic value $\gamma$.

For this purpose, define

$$D_i = \begin{cases} E[b_n^* \mid w_1, \ldots, w_i] - E[b_n^* \mid w_1, \ldots, w_{i-1}], & \text{if } i \ge 2; \\ E[b_n^* \mid w_1] - E[b_n^*], & \text{if } i = 1, \end{cases}$$

where $E[b_n^* \mid w_1, \ldots, w_i]$ is the random variable that represents the expected optimal
solution value of the Bin-Packing Problem obtained by fixing the sizes of the first
$i$ items and averaging on all other item sizes. Clearly, $E[b_n^* \mid w_1, \ldots, w_n] = b_n^*$ while
$E[b_n^* \mid \emptyset] = E[b_n^*]$. Hence, $\sum_{i=1}^{n} D_i = b_n^* - E[b_n^*]$. Furthermore, the sequence $D_i$
defines a martingale difference sequence with the property that $|D_i| \le 1$ for every
$i \ge 1$.

Applying Lemma 4.2.3 we obtain the following upper bound.

$$Pr\{ |b_n^* - E[b_n^*]| > t \} = Pr\left\{ \left| \sum_{i=1}^{n} D_i \right| > t \right\} \le 2 \exp\{ -t^2 / (2n) \}.$$

This bound can now be used to construct an upper bound on the likelihood that
$\frac{b_n^*}{n}$ differs from its asymptotic value by more than some fixed amount.

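Theorem 4.2.4 For every $\epsilon > 0$ there exists an integer $n_0$ such that for all
$n \ge n_0$,

$$Pr\left\{ \left| \frac{b_n^*}{n} - \gamma \right| > \epsilon \right\} < 2 \exp\left( -\frac{n \epsilon^2}{2} \right).$$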



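Proof. Theorem 4.2.1 implies that $\lim_{n \to \infty} E\left[\frac{b_n^*}{n}\right] = \gamma$ and therefore for every $\epsilon > 0$
and $k \ge 2$ there exists $n_0$ such that for all $n \ge n_0$ we have

$$\left| E\left[ \frac{b_n^*}{n} \right] - \gamma \right| < \frac{\epsilon}{k}.$$

Consequently,

$$\begin{aligned}
Pr\left\{ \left| \frac{b_n^*}{n} - \gamma \right| > \epsilon \right\}
&\le Pr\left\{ \left| \frac{b_n^*}{n} - \frac{E[b_n^*]}{n} \right| + \left| \frac{E[b_n^*]}{n} - \gamma \right| > \epsilon \right\} \\
&\le Pr\left\{ |b_n^* - E[b_n^*]| > n\epsilon \frac{(k-1)}{k} \right\} \\
&\le 2 \exp\left\{ -\frac{n \epsilon^2 (k-1)^2}{2k^2} \right\}.
\end{aligned}$$

Since this last inequality holds for arbitrary $k \ge 2$, this completes the proof.

These results demonstrate that $b_n^*$ is in fact very close to $n\gamma$, and this is true
for any distribution of the item sizes. Therefore, it suggests that $n\gamma$ may serve as
a good approximation for $b_n^*$ in other, more complex, combinatorial problems.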
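
To get a feel for the numbers, the right-hand side $2\exp(-n\epsilon^2/2)$ of Theorem 4.2.4 can be evaluated directly. The helper names below are our own, and the sketch deliberately ignores the requirement $n \ge n_0$, since the theorem does not give $n_0$ explicitly; the values are therefore indicative only.

```python
import math


def deviation_bound(n: int, eps: float) -> float:
    """Right-hand side 2 * exp(-n * eps**2 / 2), capped at 1 since it bounds a probability."""
    return min(1.0, 2.0 * math.exp(-n * eps ** 2 / 2.0))


def items_needed(eps: float, delta: float) -> int:
    """Smallest n for which 2 * exp(-n * eps**2 / 2) <= delta."""
    return math.ceil(2.0 * math.log(2.0 / delta) / eps ** 2)


# Number of items at which the bound on a deviation of 0.05 drops to 1 percent.
n_required = items_needed(eps=0.05, delta=0.01)
```

The bound decays exponentially in $n$, so halving $\epsilon$ multiplies the required number of items by four.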



4.3 The Traveling Salesman Problem 

In this section we demonstrate an important use for the tools presented above. Our 
objective is to show how probabilistic analysis can be used to construct effective 
algorithms with certain attractive theoretical properties. 

Let $x_1, x_2, \ldots, x_n$ be a sequence of points in the Euclidean plane ($\mathbb{R}^2$) and let
$L^*$ be the length of the optimal traveling salesman tour through these $n$ points.
We start with a deterministic upper bound on $L^*$ developed by Few (1955). We
follow Jaillet's (1985) presentation.

Theorem 4.3.1 Let $a \times b$ be the size of the smallest rectangle that contains
$x_1, x_2, \ldots, x_n$; then

$$L^* \le \sqrt{2(n - 2)ab} + 2(a + b).$$

Proof. For an integer $m$ (to be determined), partition the rectangle of size $a \times b$
(where $a$ is the length and $b$ is the height) into $2m$ equal width horizontal
strips. This creates $2m + 1$ horizontal lines and two vertical lines (counting the
boundaries of the rectangle). Label the horizontal lines $1, 2, \ldots, 2m + 1$ moving
downwards. Now temporarily delete all horizontal lines with an even label. Connect
each point $x_i$, $i = 1, 2, \ldots, n$, with two vertical segments, to the closest
(odd-labeled) horizontal line. A path through $x_1, \ldots, x_n$ can now be constructed by
proceeding from, say, the upper left-hand corner of the $a \times b$ rectangle and moving
from left to right on the first horizontal line picking up all points that are connected
(with the two vertical segments) to this line. Then we proceed downwards and
cover the third horizontal line from right to left. This continues until we reach
the end of the $(2m + 1)$st line. This path can be extended to a traveling salesman
tour by returning from the last point to the first by adding at most one vertical
and one horizontal line (we avoid diagonal movements for the sake of simplicity).
Now repeat this procedure with the even-labeled horizontal lines and, in a similar
manner, create a path through all the customers. Extend this path to a traveling
salesman tour by adding one horizontal line and one vertical segment of length
$b - \frac{b}{m}$. See Figure 4.1.

Clearly, the sum of the lengths of the two traveling salesman tours is

$$a(2m + 1) + \frac{nb}{m} + 2b + a + 2\left( b - \frac{b}{m} \right).$$

Since $L^*$ is no larger than either of these two tours, we have

$$L^* \le a + 2b + ma + \frac{(n - 2)b}{2m}.$$

FIGURE 4.1. The two traveling salesman tours constructed by the partitioning algorithm.

The right-hand side is convex in $m$; hence, we minimize on $m$. That is, we choose

$$m^* = \left\lceil \sqrt{\frac{b(n - 2)}{2a}} \right\rceil;$$

then

$$\begin{aligned}
L^* &\le a + 2b + m^* a + \frac{(n - 2)b}{2m^*} \\
&\le a + 2b + a\left( \sqrt{\frac{b(n - 2)}{2a}} + 1 \right) + \frac{b(n - 2)}{2} \sqrt{\frac{2a}{(n - 2)b}} \\
&= \sqrt{2(n - 2)ab} + 2(a + b).
\end{aligned}$$
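
The bound of Theorem 4.3.1 is easy to evaluate, and on small instances it can be compared with the exact optimum. The sketch below is illustrative only: the function names are our own, the bounding rectangle is taken to be axis-parallel, and the exact tour length is found by brute-force enumeration, which is practical only for a handful of points.

```python
import itertools
import math


def few_bound(points):
    """sqrt(2 (n - 2) a b) + 2 (a + b) for the axis-parallel bounding rectangle (n >= 2)."""
    xs, ys = zip(*points)
    a, b = max(xs) - min(xs), max(ys) - min(ys)
    n = len(points)
    return math.sqrt(2 * (n - 2) * a * b) + 2 * (a + b)


def optimal_tour_length(points):
    """Exact optimal tour length by enumerating all tours that start at the first point."""
    first, rest = points[0], points[1:]
    best = math.inf
    for perm in itertools.permutations(rest):
        tour = (first,) + perm + (first,)
        length = sum(math.dist(p, q) for p, q in zip(tour, tour[1:]))
        best = min(best, length)
    return best


# The four corners of the unit square: optimal tour 4, bound sqrt(4) + 4 = 6.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

On the unit square the bound is loose by a factor of $1.5$; its value lies in the $\sqrt{n}$ growth rate rather than in the constant.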



The above result implies that the length of the optimal traveling salesman tour
is at most $O(\sqrt{n})$. In 1959, Beardwood et al. showed that the rate of growth of $L^*$,
when customer locations are independent and identically distributed, is $\Theta(\sqrt{n})$.
Specifically, they prove the following result.

Theorem 4.3.2 Let $x_1, x_2, \ldots, x_n$ be a sequence of independent random variables
having a distribution $\mu$ with compact support in $\mathbb{R}^2$. Then there exists a constant
$\beta > 0$, independent of the distribution $\mu$, such that with probability one,

$$\lim_{n \to \infty} \frac{L^*}{\sqrt{n}} = \beta \int_{\mathbb{R}^2} f^{1/2}(x)\,dx,$$

where $f$ is the density of the absolutely continuous part of the distribution $\mu$.

Since Beardwood et al. proved this result, many researchers have proved it using 
a variety of techniques. One of these methods is based on the concept of Euclidean 
subadditive processes (Steele, 1981), which is a generalization of the concept of 
subadditive processes described earlier. 

FIGURE 4.2. Region partitioning example with n = 17, q = 3, h = 2 and t = 1. 

In this subsection we are not going to prove the result, but rather concentrate on 
its algorithmic implications. Specifically, we will describe the following polynomial 
time algorithm which is asymptotically optimal. The heuristic was suggested by 
Karp (1977), although we have modified it in several places for the purpose of 
clarifying the presentation. 

A Region Partitioning Heuristic 

In the Region Partitioning heuristic, the region containing the points is sub- 
divided into subregions such that each non-empty subregion contains exactly q 
customers (except possibly for one) and where q is to be determined later. The 
heuristic then constructs an optimal traveling salesman tour on the set of points 
within or bordering each subregion and then connects these tours to form a trav- 
eling salesman tour through all the points. 

To generate subregions each with exactly $q$ points, except for possibly one
subregion where there may be fewer points, we use the following strategy: the smallest
rectangle with sides $a$ and $b$ containing the set of points $x_1, x_2, \ldots, x_n$ is partitioned
by means of horizontal and vertical lines. First, the region is divided by $t$ vertical
lines such that each subregion contains exactly $(h + 1)q$ points except possibly
the last one. This is done precisely as follows: temporarily index the customers in
increasing order of their horizontal coordinate. Place the vertical lines so that the
$j$th vertical line (for $j \le t$) goes through the customer with index $j(h + 1)q$. Each
of these $t + 1$ subregions is then partitioned by means of $h$ horizontal lines into
$h + 1$ smaller subregions such that each contains exactly $q$ points except possibly
the last one. More precisely, this is done as follows: in each vertical strip index the
customers in increasing order of their vertical coordinates. Place the horizontal
lines so that the $j$th horizontal line (for $j \le h$) goes through the customer with
index $jq$. See Figure 4.2 for an example.
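
The partitioning step can be sketched as two rounds of sorting and chunking. This is a simplified illustration, not the exact construction above: the function name `partition` is our own, $q$ and $h$ are taken as given, and a customer lying on a dividing line is assigned to a single subregion rather than being shared by the two subregions it borders.

```python
def partition(points, q, h):
    """Split points into vertical strips of (h + 1) * q points, then each strip into groups of q."""
    by_x = sorted(points)                       # index customers by horizontal coordinate
    strip_size = (h + 1) * q
    regions = []
    for start in range(0, len(by_x), strip_size):
        # Within a strip, index customers by vertical coordinate and cut every q points.
        column = sorted(by_x[start:start + strip_size], key=lambda p: p[1])
        regions += [column[c:c + q] for c in range(0, len(column), q)]
    return regions


# The setting of Figure 4.2 (n = 17, q = 3, h = 2) on hypothetical integer coordinates.
sample = [((7 * i) % 17, (5 * i) % 17) for i in range(17)]
regions = partition(sample, q=3, h=2)
```

With $n = 17$, $q = 3$ and $h = 2$ this produces two vertical strips and six subregions, all but one of which contain exactly three points.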

To solve the Traveling Salesman Problems within each subregion, we use a dynamic
programming algorithm developed by Held and Karp (1962). It finds an optimal
traveling salesman tour through $m$ points in running time which is $O(m^2 2^m)$.
If we choose $q = \lceil \log n \rceil$ then solving the Traveling Salesman Problem for a single
region takes $O(n \log^2 n)$, and since the number of subregions is no more than
$1 + n / \log n$, the total time spent solving these Traveling Salesman Problems is
$O(n^2 \log n)$.

FIGURE 4.3. The tour generated by the region partitioning algorithm.
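
For readers who want to see where the $O(m^2 2^m)$ running time comes from, here is a compact sketch of a Held-Karp style dynamic program. It is our own illustrative implementation, not code from the original paper: subsets are encoded as bit masks, city 0 is the arbitrary start, and `dist` is assumed to be a full distance matrix.

```python
import itertools
import math


def held_karp(dist):
    """Length of an optimal tour through cities 0..m-1 for the distance matrix dist."""
    m = len(dist)
    if m == 1:
        return 0.0
    # best[(mask, j)]: shortest path that starts at city 0, visits exactly the
    # cities in mask (a bit set over cities 1..m-1) and ends at city j in mask.
    best = {(1 << j, j): dist[0][j] for j in range(1, m)}
    for size in range(2, m):
        for subset in itertools.combinations(range(1, m), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev = mask ^ (1 << j)          # the same set without city j
                best[(mask, j)] = min(best[(prev, k)] + dist[k][j]
                                      for k in subset if k != j)
    full = sum(1 << j for j in range(1, m))
    return min(best[(full, j)] + dist[j][0] for j in range(1, m))


# Unit square: the optimal tour follows the boundary and has length 4.
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
square_dist = [[math.dist(p, q) for q in corners] for p in corners]
```

There are at most $m 2^m$ states and each is computed from at most $m$ predecessors, which gives the stated bound.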

After finding optimal traveling salesman tours within each subregion, observe 
that this collection of traveling salesman tours can be easily transformed into 
a single traveling salesman tour through all the points. This is true since this 
collection of tours along with the lines added as above, defines an Eulerian graph 
where the degree of each point (node) is either two or four (a point that is on 
the boundary of two subregions will have degree four). Thus, this tour can be 
transformed into a single traveling salesman tour, and using shortcuts its length 
can be further reduced. See Figure 4.3. 

To guarantee that each non-empty subregion has exactly $q$ points, except for
maybe one, $h$ and $t$ must satisfy

$$t = \left\lfloor \frac{n}{(h + 1)q} \right\rfloor$$

and

$$t(h + 1)q \le n \le (t + 1)(h + 1)q.$$

This is achieved by choosing $h = \left\lceil \sqrt{\frac{n}{q}} - 1 \right\rceil$.



Let $L^{RP}$ be the length of the tour generated by the above Region Partitioning
heuristic. To establish the quality of the heuristic we need to find an upper bound
on $L^{RP}$; this is provided by the following.

Lemma 4.3.3

$$L^{RP} \le L^* + \frac{3}{2} P^{RP},$$

where $P^{RP}$ is the sum of the perimeters of all subregions generated by the Region
Partitioning heuristic.

FIGURE 4.4. The segments $S_1, \ldots, S_k$ and the corresponding Eulerian graph.

Proof. Let $L_j$ be the length of the optimal traveling salesman tour in subregion
$j = 1, 2, \ldots$. Similarly, let $L_j^*$ be the sum of the lengths of all segments of
the optimal traveling salesman tour through all $n$ customers that are contained
in the $j$th subregion, for $j \ge 1$. Since the collection of tours and lines constructed
above defines an Eulerian graph, we have $L^{RP} \le \sum_j L_j$. Also, by definition we
have $L^* = \sum_j L_j^*$. Thus, it is sufficient to show that

$$L_j \le L_j^* + \frac{3}{2} P_j, \qquad (4.1)$$

where $P_j$ is the perimeter of subregion $j$.

To prove inequality (4.1), assume there are exactly $k$ continuous segments $S_1, \ldots, S_k$
of the globally optimal traveling salesman tour in subregion $j$; see Figure 4.4. Let
the $2k$ end-points of these segments be $y_1, y_2, \ldots, y_{2k}$, ordered consecutively around
the boundary of subregion $j$. Without loss of generality we assume that

$$\ell(y_1 y_2) + \ell(y_3 y_4) + \cdots + \ell(y_{2k-1} y_{2k}) \le \ell(y_2 y_3) + \ell(y_4 y_5) + \cdots + \ell(y_{2k} y_1),$$

where $\ell(y_i y_{i+1})$ is the distance between points $y_i$ and $y_{i+1}$ along the perimeter
of the $j$th subregion. We construct a feasible solution for the Traveling Salesman
Problem defined by the points $x_i$ that are in the $j$th subregion. The tour is based
on (i) the segments $S_1, \ldots, S_k$; (ii) two copies of each segment $y_1 y_2, y_3 y_4, \ldots,
y_{2k-1} y_{2k}$; and (iii) one copy of each segment $y_2 y_3, y_4 y_5, \ldots, y_{2k} y_1$.

Observe (Figure 4.4) that the above three components define an Eulerian graph
whose set of vertices is the points $x_i$ that belong to the $j$th subregion plus all the
points $y_i$, for $i = 1, 2, \ldots, 2k$. This implies that the graph has an Eulerian tour
whose cost is no more than

$$L_j^* + \frac{3}{2} P_j.$$

This tour can be converted into a traveling salesman tour, using shortcuts, and
therefore

$$L_j \le L_j^* + \frac{3}{2} P_j.$$

Summing these up on $j$ completes the proof.

We can now prove the following result due to Karp. 

Theorem 4.3.4 Under the conditions of Theorem 4.3.2, with probability one,

$$\lim_{n \to \infty} \frac{L^*}{\sqrt{n}} = \lim_{n \to \infty} \frac{L^{RP}}{\sqrt{n}}.$$

Proof. Lemma 4.3.3 implies

$$L^* \le L^{RP} \le L^* + \frac{3}{2} P^{RP}.$$

Hence, we need to evaluate the quantity $P^{RP}$. Note that the number of vertical
lines added in the construction of the subregions is $t \le \sqrt{n/q}$. Each of these lines is
counted twice in the quantity $P^{RP}$.

In the second step of the RP heuristic we add $h$ horizontal lines where $h \le \sqrt{n/q}$.
These horizontal lines are also counted twice in $P^{RP}$. It follows that

$$P^{RP} \le 2\sqrt{\frac{n}{q}}\,(a + b) + 2(a + b) \le 2\sqrt{\frac{n}{\log n}}\,(a + b) + 2(a + b),$$

where the right-hand side inequality is justified by the definition of $q$.
Consequently,

$$\frac{L^*}{\sqrt{n}} \le \frac{L^{RP}}{\sqrt{n}} \le \frac{L^*}{\sqrt{n}} + \frac{3}{2} \frac{P^{RP}}{\sqrt{n}}
\le \frac{L^*}{\sqrt{n}} + \frac{3(a + b)}{\sqrt{\log n}} + \frac{3(a + b)}{\sqrt{n}}.$$

Taking the limit as $n$ goes to infinity proves the theorem.



4.4 Exercises 



Exercise 4.1. A lower bound on $\beta$. Let $X(n) = \{x_1, x_2, \ldots, x_n\}$ be a set of points
uniformly and independently distributed in the unit square. Let $\ell_j$ be the distance
from $x_j \in X(n)$ to the nearest point in $X(n) \setminus x_j$. Let $L(X(n))$ be the length of
the optimal traveling salesman tour through $X(n)$. Clearly $E(L(X(n))) \ge nE(\ell_1)$.
We evaluate a lower bound on $\beta$ in the following way.

(a) Find $Pr(\ell_1 \ge \ell)$.

(b) Use (a) to calculate a lower bound on $E(\ell_1) = \int_0^\infty Pr(\ell_1 \ge \ell)\,d\ell$.

(c) Use Stirling's formula to approximate the bound when $n$ is large.

(d) Show that $\frac{1}{2}$ is a lower bound on $\beta$.

Exercise 4.2. An upper bound on $\beta$. (Karp and Steele, 1985) The strips method
for constructing a tour through $n$ random points in the unit square dissects the
square into horizontal strips of width $\Delta$, and then follows a zigzag path, visiting
the points in the first strip in left-to-right order, then the points in the second strip
in right-to-left order, etc., finally returning to the initial point from the final point
of the last strip. Prove that, when $\Delta$ is suitably chosen, the expected length of the
tour produced by the strips method is at most $1.16\sqrt{n}$.

Exercise 4.3. Consider the TSP defined on a set of points N indexed 1,2, ... , n. 
Let Z* be the length of the optimal tour. Consider now the following strategy: 
starting with point 1, the salesman moves to the closest point in the set N \ {1}, 
say point 2. The salesman then constructs an optimal traveling salesman tour 
defined on this set of n — 1 points ( N \ {1}) and then returns to point 1 through 
point 2. Show that the length of this tour is no larger than $3Z^*/2$. Is the bound 
tight? 

Exercise 4.4. Prove that the bin-packing constant $\gamma$ satisfies $1 \le \gamma / E(w) \le 2$
where $E(w)$ is the expected item size.

Exercise 4.5. The Harmonic heuristic with parameter $M$, denoted $H(M)$, is the
following. For each $k = 1, 2, \ldots, M - 1$, items of size $\frac{1}{k+1} < w_i \le \frac{1}{k}$ are packed
separately, at most $k$ items per bin. That is, items of size greater than $\frac{1}{2}$ are packed
one per bin, items of size $\frac{1}{3} < w_i \le \frac{1}{2}$ are packed two per bin, etc. Finally, items
of size $w_i \le \frac{1}{M}$ are packed separately from the rest using First-Fit.

Given n items drawn randomly from the uniform distribution on ( g , 0] , what is 
the asymptotic number of bins used by $H(5)$? 

Exercise 4.6. Suggest a method to pack n items drawn randomly from the uniform 
distribution on [|, 1], Can you prove that your method is asymptotically optimal? 
What is the bin-packing constant ($\gamma$) for this distribution? 

Exercise 4.7. Suggest a method to pack n items drawn randomly from the uniform 
distribution on [0, 12 ]. Can you prove that your method is asymptotically optimal? 
What is the bin-packing constant ($\gamma$) for this distribution? 

Exercise 4.8. Suggest a method to pack n items drawn randomly from the uni- 
form distribution on ^]- Can you prove that your method is asymptotically 
optimal? What is the bin-packing constant ($\gamma$) for this distribution?

Exercise 4.9. (Dreyfus and Law, 1977) The following is a dynamic programming
procedure to solve the TSP. Let city 1 be an arbitrary city. Define the following
function.

$f_i(j, S)$ = the length of the shortest path from city 1 to
city $j$ visiting cities in the set $S$, where $|S| = i$.

Determine the recursive formula and solve the following instance. 



The distances between cities. 

Exercise 4.10. What is the complexity of the dynamic program developed in the 
previous exercise? 

Exercise 4.11. (Coffman and Lueker, 1991) Consider flipping a fair coin $n$ times
in succession. Let $X_n$ represent the random variable denoting the maximum excess
of the number of heads over tails at any point in the sequence of $n$ flips. It is known
that $E(X_n)$ is $\Theta(\sqrt{n})$. From this, argue that

$$E[b_n^{MATCH}] = \frac{n}{2} + \Theta(\sqrt{n}).$$



Exercise 4.12. Assume $n$ cities are uniformly distributed in the unit disc.
Consider the following heuristic for the $n$-city TSP. Let $d_i$ be the distance from city $i$
to the depot. Order the points so that $d_1 \le d_2 \le \cdots \le d_n$. For each $i = 1, 2, \ldots, n$,
draw a circle of radius $d_i$ centered at the depot; call this circle $i$. Starting at the
depot travel directly to city 1. From city 1 travel to circle 2 in a direction along the
ray through city 1 and the depot. When circle 2 is reached, follow circle 2 in the
direction (clockwise or counterclockwise) that results in a shorter route to city 2.
Repeat this same step until city $n$ is reached; then return to the depot. Let $Z_n^H$ be
the length of this traveling salesman tour. What is the asymptotic rate of growth
of $Z_n^H$? Is this heuristic asymptotically optimal?



5

Mathematical Programming Based
Bounds



5.1 Introduction 

An important method of assessing the effectiveness of any heuristic is to compare 
it to the value of a lower bound on the cost of an optimal solution. In many 
cases this is not an easy task; constructing strong lower bounds on the optimal 
solution may be as difficult as solving the problem. An attractive approach for 
generating a lower bound on the optimal solution to an $\mathcal{NP}$-Complete problem is 
the following mathematical programming approach. First, formulate the problem 
as an integer program; then relax the integrality constraint and solve the resulting 
linear program. 

What problems do we encounter when we try to use this approach? One difficulty
is deciding on an integer programming formulation. There are myriad possible
formulations from which to choose. Another difficulty may be that in order to for- 
mulate the problem as an integer program, a large (sometimes exponential) number 
of variables are required. That is, the resulting linear program may be very large, 
so that it is not possible to use standard linear programming solvers. The third 
problem is that it is not clear how tight the lower bound provided by the linear 
relaxation will be. This depends on the problem and the formulation. 

In the sections below we demonstrate how a general class of formulations can
provide tight lower bounds on the original integer program. In later chapters we
show that these and similar linear programs can be solved effectively and implemented
in algorithms that solve logistics problems to optimality or near optimality.

5.2 An Asymptotically Tight Linear Program



Again, consider the Bin-Packing Problem. There are many ways to formulate the 
problem as an integer program. The one we use here is based on formulating it as 
a Set-Partitioning Problem. The idea is as follows. Let F be the collection of all 
sets of items that can be feasibly packed into one bin; that is, 



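\[ F = \Big\{ S \subseteq N : \sum_{i \in S} w_i \le 1 \Big\}. \]

For any $i \in N$ and $S \in F$, let

\[ a_{iS} = \begin{cases} 1, & \text{if } i \in S, \\ 0, & \text{otherwise.} \end{cases} \]

Let

\[ y_S = \begin{cases} 1, & \text{if the set of items } S \text{ is placed in a single bin,} \\ 0, & \text{otherwise.} \end{cases} \]

Then the set-partitioning formulation of the Bin-Packing Problem is the following
integer program.

\[
\begin{aligned}
\text{Problem } P: \quad \min \ & \sum_{S \in F} y_S \\
\text{s.t.} \ & \sum_{S \in F} a_{iS}\, y_S = 1, \quad \forall i \in N, \qquad (5.1) \\
& y_S \in \{0, 1\}, \quad \forall S \in F.
\end{aligned}
\]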
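To make the set-partitioning formulation concrete, the following minimal Python sketch enumerates the collection F for a tiny instance and solves Problem P by brute force. The function names `feasible_sets` and `min_partition` and the item weights are illustrative assumptions, not part of the text, and the approach is practical only for a handful of items because F grows exponentially.

```python
from fractions import Fraction
from itertools import combinations

def feasible_sets(weights):
    """Return F: all nonempty index sets whose total weight is at most 1."""
    items = range(len(weights))
    return [frozenset(S)
            for r in range(1, len(weights) + 1)
            for S in combinations(items, r)
            if sum(weights[i] for i in S) <= 1]

def min_partition(weights):
    """Solve Problem P by dynamic programming over subsets of items (tiny n only)."""
    n = len(weights)
    masks = [sum(1 << i for i in S) for S in feasible_sets(weights)]
    full = (1 << n) - 1
    best = [None] * (full + 1)   # best[c] = fewest bins that pack exactly the items in c
    best[0] = 0
    for covered in range(full + 1):
        if best[covered] is None:
            continue
        for m in masks:
            if covered & m == 0:  # S is disjoint from what is already packed
                nxt = covered | m
                if best[nxt] is None or best[covered] + 1 < best[nxt]:
                    best[nxt] = best[covered] + 1
    return best[full]

# Hypothetical item weights, kept as exact fractions to avoid rounding issues.
weights = [Fraction(1, 2), Fraction(1, 2), Fraction(1, 3), Fraction(2, 3), Fraction(1, 4)]
print(len(feasible_sets(weights)), min_partition(weights))
```

Each mask plays the role of one column $a_{\cdot S}$ of the constraint matrix in (5.1), and choosing it corresponds to setting $y_S = 1$.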

In this section we prove that the relative difference between the optimal solution
of the linear relaxation of Problem P and the optimal solution of Problem P (the
integer solution) tends to zero as $|N| = n$, the number of items, increases. First
we need the following definition.

Definition 5.2.1 A function $\phi$ is Lipschitz continuous of order q on a set $A \subseteq \mathbb{R}$
if there exists a constant K such that

\[ |\phi(x) - \phi(y)| \le K |x - y|^q, \quad \forall x, y \in A. \]

Our first result of this section is the following. 

Theorem 5.2.2 Let the item sizes be independently and identically distributed
according to a distribution $\Phi$ which is Lipschitz continuous of order $q \ge 1$ on
[0,1]. Let $b^{LP}$ be the value of the optimal solution to the linear relaxation of P,
and let $b^*$ be the value of the optimal integer solution to P; that is, the value of
the optimal solution to the Bin-Packing Problem. Then, with probability one,

\[ \lim_{n \to \infty} \frac{1}{n} b^{LP} = \lim_{n \to \infty} \frac{1}{n} b^*. \]

To prove the theorem we consider a related model. Consider a discretized Bin-Packing
Problem in which there are a finite number W of item sizes. Each different
size defines an item type. Let $n_i$ be the number of items of type i, for
$i = 1, 2, \ldots, W$, and let $n = \sum_{i=1}^{W} n_i$ be the total number of items. Clearly,
this discretized Bin-Packing Problem can be solved by formulating it as the
set-partitioning problem P. To obtain some intuition about the linear relaxation of
P, we first introduce another formulation closely related to P.

Let a bin assignment be a vector $(a_1, a_2, \ldots, a_W)$, where $a_i \ge 0$ are integers, and
such that a single bin can contain $a_1$ items of type 1, along with $a_2$ items of type
2, ..., along with $a_W$ items of type W, without violating the capacity constraint.
Index all the possible bin assignments $1, 2, \ldots, R$, and note that R is independent
of n. The Bin-Packing Problem can be formulated as follows. Let

$A_{ir}$ = number of items of type i in bin assignment r,
for each $i = 1, 2, \ldots, W$ and $r = 1, 2, \ldots, R$. Let

$y_r$ = number of times bin assignment r is used in the optimal solution.

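The new formulation of the discretized Bin-Packing Problem is:

\[
\begin{aligned}
\text{Problem } P_D: \quad \min \ & \sum_{r=1}^{R} y_r \\
\text{s.t.} \ & \sum_{r=1}^{R} y_r A_{ir} = n_i, \quad \forall i = 1, 2, \ldots, W, \\
& y_r \ge 0 \text{ and integer}, \quad \forall r = 1, 2, \ldots, R.
\end{aligned}
\]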
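The bin assignments that index the columns of this formulation can be listed mechanically. The sketch below is one possible enumeration, assuming strictly positive item sizes given as exact fractions; the name `bin_assignments` and the two sizes are hypothetical. It illustrates that the list depends only on the W sizes and not on the number of items n.

```python
from fractions import Fraction

def bin_assignments(sizes):
    """Enumerate all nonzero vectors (a_1, ..., a_W) of nonnegative integers
    with sum_i a_i * sizes[i] <= 1, by depth-first search over item types.
    Sizes must be strictly positive, otherwise the search would not terminate."""
    result = []

    def extend(i, remaining, current):
        if i == len(sizes):
            if any(current):              # skip the empty bin
                result.append(tuple(current))
            return
        count = 0
        while count * sizes[i] <= remaining:
            extend(i + 1, remaining - count * sizes[i], current + [count])
            count += 1

    extend(0, Fraction(1), [])
    return result

# Hypothetical instance with W = 2 item types.
sizes = [Fraction(1, 2), Fraction(1, 3)]
patterns = bin_assignments(sizes)
print(patterns)
```

Each tuple in `patterns` is one column $(A_{1r}, \ldots, A_{Wr})$ of Problem $P_D$.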

Let $b^*_D$ be the value of the optimal solution to Problem $P_D$ and let $b^{LP}_D$ be the
optimal solution value of the linear relaxation of Problem $P_D$. Clearly, Problem P and
Problem $P_D$ have the same optimal solution values; that is, $b^* = b^*_D$. On the other
hand, $b^{LP}$ is not necessarily equal to $b^{LP}_D$. However, it is easy to see that any feasible
solution to the linear relaxation of Problem P can be used to construct a feasible
solution to the linear relaxation of Problem $P_D$ and therefore,

\[ b^{LP} \ge b^{LP}_D. \qquad (5.2) \]

The following is the crucial lemma needed to prove Theorem 5.2.2. 

Lemma 5.2.3

\[ b^{LP} \le b^* \le b^{LP}_D + W \le b^{LP} + W. \]

Proof. The left-most inequality is trivial while the right-most inequality is due to
equation (5.2). To prove the central inequality note that in Problem $P_D$ there are
W constraints, one for each item type. Let $y_r$, for $r = 1, 2, \ldots, R$, be an optimal
solution to the linear relaxation of Problem $P_D$ and observe that there exists such
an optimal solution with at most W positive variables; one for each constraint.
We construct a feasible solution to Problem $P_D$ by rounding the linear solution
up; that is, for each $r = 1, 2, \ldots, R$ with $y_r > 0$ we make $y_r = \lceil y_r \rceil$ and for each
$r = 1, 2, \ldots, R$ with $y_r = 0$ we make $y_r = 0$. Hence, the increase in the objective
function is no more than W. ∎

Observe that the upper bound on $b^*$ obtained in Lemma 5.2.3 consists of two
terms. The first, $b^{LP}$, is a lower bound on $b^*$, which clearly grows with the number
of items n. The second term (W) is independent of n. Therefore, the upper bound
on $b^*$ of Lemma 5.2.3 is dominated by $b^{LP}$ and consequently we see that for large
n, $b^* \approx b^{LP}$, exactly what is implied by Theorem 5.2.2.
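The rounding argument behind Lemma 5.2.3 can be checked on a small hand-made example. The sketch below assumes a hypothetical instance with W = 2 item types and a fractional solution written down by hand (it is optimal for the relaxation here because every bin it uses is completely full); `round_up` is an illustrative helper, not a routine from the text.

```python
from fractions import Fraction
from math import ceil

def round_up(y):
    """Round every positive entry of a fractional solution up; zeros stay zero."""
    return {r: ceil(v) for r, v in y.items() if v > 0}

# Hypothetical data: W = 2 item types of sizes 1/2 and 1/3, with n_1 = 3 and n_2 = 4.
demand = (3, 4)
# A fractional solution using W = 2 bin assignments: (2, 0) and (0, 3).
y = {(2, 0): Fraction(3, 2), (0, 3): Fraction(4, 3)}

# y satisfies the equality constraints of the relaxed Problem P_D.
covered = tuple(sum(v * a[i] for a, v in y.items()) for i in range(2))
rounded = round_up(y)
# The rounded solution has room for at least n_i items of each type, so every item fits.
packed = tuple(sum(v * a[i] for a, v in rounded.items()) for i in range(2))
increase = sum(rounded.values()) - sum(y.values())
print(covered, packed, increase)
```

The objective grows by less than one bin per positive variable, so by at most W = 2 bins in total, in line with the lemma.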

We can now use the intuition developed in the above analysis of the discrete 
Bin-Packing Problem to prove Theorem 5.2.2. 

Proof. It is clear that $b^{LP} \le b^*$ and therefore $\lim_{n \to \infty} b^{LP}/n \le \lim_{n \to \infty} b^*/n$. To
prove the upper bound, partition the interval (0, 1] into $k \ge 2$ subintervals of equal
length. Let $N_j$ be the set of items whose size w satisfies $\frac{j-1}{k} < w \le \frac{j}{k}$ and let
$|N_j| = n_j$, $j = 1, 2, \ldots, k$. We construct a new Bin-Packing Problem where item
sizes take only the values $\frac{j}{k}$, $j = 1, 2, \ldots, k-1$, and where the number of items
of size $\frac{j}{k}$ is $\min\{n_j, n_{j+1}\}$, $j = 1, 2, \ldots, k-1$. We refer to this instance of the
Bin-Packing Problem as the reduced instance. For this reduced instance, define $\bar{b}^*$,
$\bar{b}^{LP}$ and $\bar{b}^{LP}_D$ to be the obvious quantities.
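The construction of the reduced instance is easy to carry out by counting. The following sketch assumes item sizes in (0, 1] supplied as exact fractions; the function name `reduced_instance` and the sample sizes are hypothetical. It returns the number of items of each size j/k in the reduced instance together with the quantity $\sum_{j=1}^{k-1} |n_j - n_{j+1}| + n_k$, which bounds the number of items left over.

```python
from fractions import Fraction
from math import ceil

def reduced_instance(sizes, k):
    """Return ({j: number of items of size j/k}, bound on the number of leftover items)."""
    n = [0] * (k + 1)                      # n[j] = |N_j| for j = 1..k (n[0] is unused)
    for w in sizes:
        j = ceil(w * k)                    # the unique j with (j-1)/k < w <= j/k
        n[j] += 1
    reduced = {j: min(n[j], n[j + 1]) for j in range(1, k)}
    leftover_bound = sum(abs(n[j] - n[j + 1]) for j in range(1, k)) + n[k]
    return reduced, leftover_bound

# Hypothetical item sizes and k = 4 subintervals.
sizes = [Fraction(1, 8), Fraction(1, 4), Fraction(3, 8),
         Fraction(1, 2), Fraction(5, 8), Fraction(1)]
reduced, leftover_bound = reduced_instance(sizes, 4)
print(reduced, leftover_bound)
```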

It is easy to see that we can always construct a feasible solution to the original 
Bin-Packing Problem by solving the Bin-Packing Problem defined on the reduced 
instance and then assigning each of the remaining items to a single bin. This results 
in: 

\[
\begin{aligned}
b^* &\le \bar{b}^* + \sum_{j=1}^{k-1} |n_j - n_{j+1}| + n_k \\
    &\le \bar{b}^{LP}_D + k + \sum_{j=1}^{k-1} |n_j - n_{j+1}| + n_k \qquad \text{(using Lemma 5.2.3)} \\
    &\le \bar{b}^{LP} + k + \sum_{j=1}^{k-1} |n_j - n_{j+1}| + n_k.
\end{aligned}
\]

We now argue that $\bar{b}^{LP} \le b^{LP}$. This must be true since every item in the reduced
instance can be associated with a unique item in the original instance whose size 
is at least as large. Thus, every feasible solution to the linear relaxation of the 
set-partitioning problem defined on the original instance is feasible for the same 
problem on the reduced instance. Hence, 

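\[ b^* \le b^{LP} + k + \sum_{j=1}^{k-1} |n_j - n_{j+1}| + n_k. \]

The Strong Law of Large Numbers and the Mean Value Theorem imply that
for a given $j = 1, \ldots, k$, there exists $s_j \in [\frac{j-1}{k}, \frac{j}{k}]$ such that, with probability one,

\[ \lim_{n \to \infty} \frac{n_j}{n} = \frac{1}{k} \phi(s_j), \]

where $\phi$ is the density of item sizes. Hence,

\[
\begin{aligned}
\lim_{n \to \infty} \frac{1}{n} |n_j - n_{j+1}| &= \frac{1}{k} |\phi(s_j) - \phi(s_{j+1})| \\
&\le \frac{1}{k} K (s_{j+1} - s_j)^q \qquad \text{(by Lipschitz continuity)} \\
&\le \frac{K}{k} \Big( \frac{2}{k} \Big)^q \qquad \text{(since } s_{j+1} - s_j \le \tfrac{2}{k}\text{)} \\
&\le \frac{2K}{k^2} \qquad \text{(since } q \ge 1 \text{ and } k \ge 2\text{)}.
\end{aligned}
\]

Consequently, dividing the bound on $b^*$ by n and letting n grow,

\[ \lim_{n \to \infty} \frac{b^*}{n} \le \lim_{n \to \infty} \frac{b^{LP}}{n} + \frac{2K(k-1)}{k^2} + \frac{\phi(s_k)}{k}. \]

Since $\phi$ is Lipschitz continuous on [0, 1], and hence bounded, and since this holds
for arbitrary k, this completes the proof. ∎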

In fact, it appears that the linear relaxation of the set-partitioning formulation
may be extremely close to the optimal solution in the case of the Bin-Packing
Problem. Recently Chan et al. (1998) show that the worst-case effectiveness of the
set-partitioning lower bound (the linear relaxation), that is, the maximum ratio of
the optimal integer solution ($b^*$) to the optimal linear relaxation ($b^{LP}$), is $\frac{4}{3}$. They
also provide an example achieving this bound. That is, for any number of items
and any set of item weights, the value of the linear program is at least 75% of the
optimal solution value.



5.3 Lagrangian Relaxation 

In 1971, Held and Karp applied a mathematical technique known as Lagrangian 
relaxation to generate a lower bound on a general integer (linear) program. Our 
discussion of the method follows the elegant presentation of Fisher (1981). We 
start with the following integer program. 

\[
\begin{aligned}
\text{Problem } P: \quad Z = \min \ & cx \\
\text{s.t.} \ & Ax = b, \qquad (5.3) \\
& Dx \le e, \qquad (5.4) \\
& x \ge 0 \text{ and integer},
\end{aligned}
\]

where x is an n-vector, b is an m-vector, e is a k-vector, A is an $m \times n$ matrix and
D is a $k \times n$ matrix. Let the optimal solution value of the linear relaxation of Problem P
be $Z_{LP}$. The Lagrangian relaxation of constraints (5.3) with multipliers $u \in \mathbb{R}^m$
is:

\[
\begin{aligned}
\text{Problem } LR_u: \quad Z_D(u) = \min \ & cx + u(Ax - b) \\
\text{s.t.} \ & Dx \le e, \qquad (5.5) \\
& x \ge 0 \text{ and integer}.
\end{aligned}
\]



The following is a simple observation. 

Lemma 5.3.1 For all $u \in \mathbb{R}^m$, $Z_D(u) \le Z$.

Proof. Let x be any feasible solution to Problem P. Clearly, x is also feasible for
$LR_u$ and since $Z_D(u)$ is its optimal solution value, we get

\[ Z_D(u) \le cx + u(Ax - b) = cx. \]

Consequently, $Z_D(u) \le Z$. ∎

Remark: If the constraints $Ax = b$ in Problem P are replaced with the constraints
$Ax \le b$, then Lemma 5.3.1 holds for $u \in \mathbb{R}^m_+$.

Since $Z_D(u) \le Z$ holds for all u, we are interested in the vector u that provides
the largest possible lower bound. This is achieved by solving Problem D, called
the Lagrangian dual, defined as follows.

\[ \text{Problem } D: \quad Z_D = \max_u Z_D(u). \]

Problem D has a number of important and interesting properties. 

Lemma 5.3.2 The function $Z_D(u)$ is a piecewise linear concave function of u.

This implies that $Z_D(u)$ attains its maximum at a nondifferentiable point. This
maximal point can be found using a technique called subgradient optimization
which can be described as follows: given an initial vector $u^0$ the method generates
a sequence of vectors $\{u^k\}$ defined by

\[ u^{k+1} = u^k + t_k (A x^k - b), \qquad (5.6) \]

where $x^k$ is an optimal solution to Problem $LR_{u^k}$ and $t_k$ is a positive scalar called
the step size. Polyak (1967) shows that if the step sizes $t_1, t_2, \ldots$ are chosen such
that $\lim_{k \to \infty} t_k = 0$ and $\sum_{k \ge 0} t_k$ is unbounded, then $Z_D(u^k)$ converges to $Z_D$.
The step size commonly used in practice is

\[ t_k = \frac{\lambda_k (UB - Z_D(u^k))}{\sum_{i=1}^{m} (a_i x^k - b_i)^2}, \]

where UB is an upper bound on the optimal integer solution value (found using a
heuristic), $a_i x^k - b_i$ is the difference between the left-hand side and the right-hand
side of the i-th constraint in $A x^k \le b$, and $\lambda_k$ is a scalar satisfying $0 < \lambda_k \le 2$.
Usually, one starts with $\lambda_0 = 2$ and cuts it in half every time $Z_D(u)$ fails to increase
after a number of iterations.
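The update (5.6) and the step-size rule above can be sketched in a few lines of Python. The example below is an illustration under stated assumptions rather than a general implementation: the constraints $Dx \le e$ are taken to be $x \le 1$, so the relaxed problem is solved by inspecting reduced costs; the data c, A, b, the upper bound of 4 (the cost of the feasible point x = (1, 0, 0, 1)) and the rule of halving $\lambda$ after five non-improving iterations are all hypothetical choices.

```python
from itertools import product

def lagrangian_subproblem(c, A, b, u):
    """Solve min cx + u(Ax - b) over x in {0,1}^n by inspecting reduced costs."""
    n, m = len(c), len(b)
    reduced = [c[j] + sum(u[i] * A[i][j] for i in range(m)) for j in range(n)]
    x = [1 if r < 0 else 0 for r in reduced]
    value = sum(r * xj for r, xj in zip(reduced, x)) - sum(ui * bi for ui, bi in zip(u, b))
    return value, x

def subgradient(c, A, b, upper_bound, iterations=50):
    """Return the best lower bound found and the sequence of values Z_D(u^k)."""
    m, n = len(b), len(c)
    u, lam, best, stalled, history = [0.0] * m, 2.0, float("-inf"), 0, []
    for _ in range(iterations):
        value, x = lagrangian_subproblem(c, A, b, u)
        history.append(value)
        if value > best + 1e-12:
            best, stalled = value, 0
        else:
            stalled += 1
            if stalled == 5:               # assumed rule: halve lambda after 5 stalls
                lam, stalled = lam / 2, 0
        g = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        norm2 = sum(gi * gi for gi in g)
        if norm2 == 0:                     # x satisfies Ax = b, so it is optimal for P
            break
        t = lam * (upper_bound - value) / norm2
        u = [ui + t * gi for ui, gi in zip(u, g)]
    return best, history

def brute_force(c, A, b):
    """Optimal value Z of the small integer program, by enumeration."""
    n, m = len(c), len(b)
    feasible = (x for x in product((0, 1), repeat=n)
                if all(sum(A[i][j] * x[j] for j in range(n)) == b[i] for i in range(m)))
    return min(sum(cj * xj for cj, xj in zip(c, x)) for x in feasible)

c = [3, 2, 4, 1]
A = [[1, 1, 0, 0], [0, 1, 1, 1]]
b = [1, 1]
best, history = subgradient(c, A, b, upper_bound=4.0)
Z = brute_force(c, A, b)
print(best, Z)
```

By Lemma 5.3.1 every value in `history` is a valid lower bound on Z, whatever multipliers the iteration happens to visit.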

It is now interesting to compare the Lagrangian relaxation lower bound ($Z_D$) to
the lower bound achieved by solving the linear relaxation of the set-partitioning
formulation ($Z_{LP}$).

Theorem 5.3.3

\[ Z_{LP} \le Z_D. \]



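Proof.

\[
\begin{aligned}
Z_D &= \max_u \Big\{ \min_x \{ cx + u(Ax - b) : Dx \le e,\ x \ge 0 \text{ and integer} \} \Big\} \\
&\ge \max_u \Big\{ \min_x \{ cx + u(Ax - b) : Dx \le e,\ x \ge 0 \} \Big\} \\
&= \max_u \max_v \{ ve - ub : vD \le c + uA,\ v \le 0 \} \qquad \text{(by strong duality)} \\
&= \max_{u,v} \{ ve - ub : vD \le c + uA,\ v \le 0 \} \\
&= \min_y \{ cy : Ay = b,\ Dy \le e,\ y \ge 0 \} \qquad \text{(by strong duality)} \\
&= Z_{LP}.
\end{aligned}
\]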

We say a mathematical program P possesses the integrality property if the solu- 
tion to the linear relaxation of P always provides an integer solution. Inspection 
of the above proof reveals the following corollary. 

Corollary 5.3.4 If Problem $LR_u$ possesses the integrality property, then $Z_D = Z_{LP}$.

5.4 Lagrangian Relaxation and the Traveling Salesman 
Problem 

Held and Karp (1970, 1971) developed the Lagrangian relaxation technique in the
context of the Traveling Salesman Problem. They show some interesting relationships
between this method and a graph-theoretic problem called the minimum
weight 1-tree problem.

5.4.1 The 1-Tree Lower Bound

We start by defining a 1-tree. For a given choice of vertex, say vertex 1, a 1-tree is
a tree having vertex set $\{2, 3, \ldots, n\}$ together with two distinct edges connected to
vertex 1. Therefore, a 1-tree is a graph with exactly one cycle. Define the weight
of a 1-tree to be the sum of the costs of all its edges. In the minimum weight 1-tree 
problem the objective is to find a 1-tree of minimum weight. Such a 1-tree can be 
constructed by finding a minimum spanning tree on the entire network excluding 
vertex 1 and its corresponding edges, and by adding to the minimum spanning 
tree the two edges incident to vertex 1 of minimum cost. 
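The construction just described translates directly into code. The sketch below is one possible rendering, assuming a symmetric distance matrix with at least three vertices and using vertex 0 in the role of vertex 1; the function name `minimum_one_tree` and the four sample points are hypothetical. It uses a simple quadratic-per-step version of Prim's algorithm for the spanning tree.

```python
from math import dist

def minimum_one_tree(d):
    """Return (weight, edges) of a minimum weight 1-tree; vertex 0 is the special vertex."""
    n = len(d)
    others = list(range(1, n))
    # Prim's algorithm for a minimum spanning tree on the vertices other than 0.
    in_tree = {others[0]}
    edges = []
    while len(in_tree) < len(others):
        i, j = min(((i, j) for i in in_tree for j in others if j not in in_tree),
                   key=lambda e: d[e[0]][e[1]])
        in_tree.add(j)
        edges.append((i, j))
    # Add the two cheapest edges incident to the special vertex.
    a, b = sorted(others, key=lambda j: d[0][j])[:2]
    edges += [(0, a), (0, b)]
    return sum(d[i][j] for i, j in edges), edges

# Hypothetical instance: the four corners of the unit square.
points = [(0, 0), (1, 0), (1, 1), (0, 1)]
d = [[dist(p, q) for q in points] for p in points]
weight, edges = minimum_one_tree(d)
print(weight, edges)
```

On this instance the minimum weight 1-tree happens to be the tour around the square, so the bound is tight.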

We observe that any traveling salesman tour is a 1-tree in which each
vertex has degree 2. Moreover, if a minimum weight 1-tree is a tour, then it is
an optimal traveling salesman tour. Thus, the minimum weight 1-tree provides a 
lower bound on the length of the optimal traveling salesman tour. 

Unfortunately, this bound can be quite weak. However, there are ways to improve
it. For this purpose consider the vector $\pi = \{\pi_1, \pi_2, \ldots, \pi_n\}$ and the following
transformation of the distances $\{d_{ij}\}$:

\[ d'_{ij} = d_{ij} + \pi_i + \pi_j. \]

Let $L^*$ be the length of the optimal tour with respect to the distance matrix
$\{d_{ij}\}$. It is clear that the same tour is also optimal with respect to the distance
matrix $\{d'_{ij}\}$. To see this, observe that any traveling salesman tour S of cost L
with respect to $\{d_{ij}\}$ has a cost $L + 2 \sum_{i=1}^{n} \pi_i$ with respect to $\{d'_{ij}\}$. Thus, the
difference between the length of any traveling salesman tour in $\{d_{ij}\}$ and $\{d'_{ij}\}$ is
constant, independent of the tour.

Observe also that the above transformation of the distances does change the
minimum 1-tree. How can this idea be used? First, enumerate all possible 1-trees
and let $d_i^k$ be the degree of vertex i in the k-th 1-tree. Let $T_k$ be the weight (cost) of
that 1-tree (before transforming the distances). This implies that the cost of that
1-tree after the transformation is exactly

\[ T_k + \sum_{i \in V} d_i^k \pi_i. \]

Thus, the minimum weight 1-tree on the transformed distance matrix is obtained
by solving

\[ \min_k \Big\{ T_k + \sum_{i \in V} d_i^k \pi_i \Big\}. \]

Since, in the transformed distance matrix, the optimal traveling salesman tour
does not change while the 1-tree provides a lower bound, we have

\[ L^* + 2 \sum_{i \in V} \pi_i \ge \min_k \Big\{ T_k + \sum_{i \in V} d_i^k \pi_i \Big\}, \]

which implies

\[ L^* \ge \min_k \Big\{ T_k + \sum_{i \in V} (d_i^k - 2) \pi_i \Big\} = w(\pi). \]

Consequently, the best lower bound is obtained by maximizing the function $w(\pi)$
over all possible values of $\pi$. How can we find the best value of $\pi$? Held and Karp
(1970, 1971) use the subgradient method described in the previous section. That
is, starting with some arbitrary vector $\pi^0$, in step k the method updates the vector
$\pi^k$ according to

\[ \pi_i^{k+1} = \pi_i^k + t_k (d_i^k - 2), \]

where $\pi_i^k$ is the i-th element in the vector $\pi^k$, $d_i^k$ here denotes the degree of vertex i
in the minimum weight 1-tree found in step k, and $t_k$, the step size, equals

\[ t_k = \frac{\lambda_k (UB - w(\pi^k))}{\sum_{i=1}^{n} (d_i^k - 2)^2}. \]
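The following Python sketch puts the pieces together: it computes $w(\pi)$ from a minimum weight 1-tree on the transformed distances and applies the subgradient update with the step size above. It is an illustration under assumptions, not Held and Karp's implementation: vertex 0 plays the role of vertex 1, the upper bound is simply the length of the tour visiting the points in the given order, $\lambda$ is halved after five non-improving iterations, and the six sample points are hypothetical. The brute-force optimum is included only to check the bound on this small instance.

```python
from itertools import permutations
from math import dist

def one_tree(d, pi):
    """Minimum 1-tree for distances d[i][j] + pi[i] + pi[j] (vertex 0 is special).
    Returns its transformed weight and the degree of every vertex."""
    n = len(d)
    cost = lambda i, j: d[i][j] + pi[i] + pi[j]
    others = list(range(1, n))
    in_tree, degree, weight = {others[0]}, [0] * n, 0.0
    while len(in_tree) < len(others):          # Prim's algorithm on vertices 1..n-1
        i, j = min(((i, j) for i in in_tree for j in others if j not in in_tree),
                   key=lambda e: cost(*e))
        in_tree.add(j)
        weight += cost(i, j)
        degree[i] += 1
        degree[j] += 1
    for j in sorted(others, key=lambda j: cost(0, j))[:2]:   # two cheapest edges at vertex 0
        weight += cost(0, j)
        degree[0] += 1
        degree[j] += 1
    return weight, degree

def held_karp_bound(d, upper_bound, iterations=200):
    """Best value of w(pi) found by subgradient optimization."""
    n = len(d)
    pi, lam, best, stalled = [0.0] * n, 2.0, float("-inf"), 0
    for _ in range(iterations):
        weight, degree = one_tree(d, pi)
        w = weight - 2 * sum(pi)               # w(pi) = min_k {T_k + sum_i (d_i^k - 2) pi_i}
        if w > best + 1e-12:
            best, stalled = w, 0
        else:
            stalled += 1
            if stalled == 5:                   # assumed rule: halve lambda after 5 stalls
                lam, stalled = lam / 2, 0
        g = [deg - 2 for deg in degree]
        norm2 = sum(x * x for x in g)
        if norm2 == 0:                         # the 1-tree is a tour, hence optimal
            break
        t = lam * (upper_bound - w) / norm2
        pi = [p + t * x for p, x in zip(pi, g)]
    return best

points = [(0, 0), (3, 0), (3, 2), (1, 3), (0, 2), (1, 1)]
d = [[dist(p, q) for q in points] for p in points]
tour = lambda order: sum(d[order[i]][order[(i + 1) % len(order)]] for i in range(len(order)))
upper = tour(range(len(points)))               # any tour gives an upper bound
optimum = min(tour((0,) + p) for p in permutations(range(1, len(points))))
plain, _ = one_tree(d, [0.0] * len(points))    # 1-tree bound without multipliers
bound = held_karp_bound(d, upper)
print(plain, bound, optimum)
```

Since the first iteration uses $\pi = 0$, the returned bound is never worse than the plain 1-tree bound, and it never exceeds the optimal tour length.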

5.4.2 The 1-Tree Lower Bound and Lagrangian Relaxation

We now relate the 1-tree lower bound to a Lagrangian relaxation associated with
the following formulation of the Traveling Salesman Problem. For every $e \in E$,
let $d_e$ be the cost of the edge and let $x_e$ be a variable that takes on the value 1 if
the optimal tour includes the edge and the value zero otherwise. Given a subset
$S \subseteq V$, let $E(S)$ be the set of edges from E such that each edge has its two endpoints
in S. Let $\delta(S)$ be the collection of edges from E in the cut separating S
from $V \setminus S$. The Traveling Salesman Problem can be formulated as follows:

\[
\begin{aligned}
\text{Problem } P': \quad Z^* = \min \ & \sum_{e \in E} d_e x_e \\
\text{s.t.} \ & \sum_{e \in \delta(i)} x_e = 2, \quad \forall i = 1, 2, \ldots, n \qquad (5.7) \\
& \sum_{e \in E(S)} x_e \le |S| - 1, \quad \forall S \subset V \setminus \{1\},\ S \ne \emptyset \qquad (5.8) \\
& 0 \le x_e \le 1, \quad \forall e \in E \qquad (5.9) \\
& x_e \text{ integer}, \quad \forall e \in E. \qquad (5.10)
\end{aligned}
\]

Constraints (5.7) ensure that each vertex has an edge going in and an edge 
going out. Constraints (5.8), called subtour elimination constraints, forbid integral 
solutions consisting of a set of disjoint cycles. 

Observe that constraints (5.7) can be replaced by the following constraints.

\[ \sum_{e \in \delta(i)} x_e = 2, \quad \forall i = 1, \ldots, n-1 \qquad (5.11) \]

\[ \sum_{e \in E} x_e = n. \qquad (5.12) \]

This is true since constraints (5.11) are exactly constraints (5.7) for $i = 1, \ldots, n-1$.
The only missing constraint is $\sum_{e \in \delta(n)} x_e = 2$. Therefore, it is sufficient to show
that (5.12) holds if and only if this one holds. To see this:

\[
\begin{aligned}
\sum_{e \in E} x_e &= \frac{1}{2} \sum_{i=1}^{n} \sum_{e \in \delta(i)} x_e \\
&= \frac{1}{2} \Big[ \sum_{i=1}^{n-1} \sum_{e \in \delta(i)} x_e + \sum_{e \in \delta(n)} x_e \Big] \\
&= (n-1) + \frac{1}{2} \sum_{e \in \delta(n)} x_e.
\end{aligned}
\]

Thus, $\sum_{e \in E} x_e = n$ if and only if $\sum_{e \in \delta(n)} x_e = 2$.

The resulting formulation of the Traveling Salesman Problem is

\[ \min \Big\{ \sum_{e \in E} d_e x_e \ : \ (5.8), (5.9), (5.10), (5.11) \text{ and } (5.12) \Big\}. \]



We can now use the Lagrangian relaxation technique described in Section 5.3 and
get the following lower bound on the length of the optimal tour.

\[ \max_u \min \Big\{ \sum_{i,j \in V} (d_{ij} + u_i + u_j) x_{ij} - 2 \sum_{i \in V} u_i \ : \ (5.8), (5.9), (5.10) \text{ and } (5.12) \Big\}. \]



Interestingly enough, Edmonds (1971) showed that the extreme points of the
polyhedron defined by constraints (5.8), (5.9), (5.10) and (5.12) form the set of all
1-trees; that is, the optimal solution to a linear program defined on these constraints
must be integral. Thus, we can apply Corollary 5.3.4 to see that the lower bound
obtained from the 1-tree approach is the same as the linear relaxation of Problem
P'.



5.5 The Worst-Case Effectiveness of the 1-tree Lower 
Bound 

We conclude this chapter by demonstrating that the Held and Karp (1970, 1971) 
1-tree relaxation provides a lower bound that is not far from the length of the 
optimal tour. For this purpose, we show that the Held and Karp lower bound can 
be written as follows. 



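\[
\begin{aligned}
\text{Problem } HK: \quad Z_{LP} = \min \ & \sum_{e \in E} d_e x_e \\
\text{s.t.} \ & \sum_{e \in \delta(i)} x_e = 2, \quad \forall i = 1, 2, \ldots, n \qquad (5.13) \\
& \sum_{e \in \delta(S)} x_e \ge 2, \quad \forall S \subset V \setminus \{1\},\ S \ne \emptyset \qquad (5.14) \\
& 0 \le x_e \le 1, \quad \forall e \in E. \qquad (5.15)
\end{aligned}
\]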

Lemma 5.5.1 The linear relaxation of Problem P' is equivalent to Problem HK. 

Proof. We first show that any feasible solution x to the linear relaxation of Problem
P' is feasible for Problem HK. Since $\sum_{e \in E(S)} x_e \le |S| - 1$, $\sum_{e \in E(V \setminus S)} x_e \le n - |S| - 1$
and $\sum_{e \in E(V)} x_e = n$ (why?) we get $\sum_{e \in \delta(S)} x_e \ge 2$.

Similarly, we show that any feasible solution x to Problem HK is feasible for the
linear relaxation of Problem P'. The feasibility of x in Problem HK implies that
$\sum_{i \in S} \sum_{e \in \delta(i)} x_e = 2|S|$. However,

\[ \sum_{i \in S} \sum_{e \in \delta(i)} x_e = 2 \sum_{e \in E(S)} x_e + \sum_{e \in \delta(S)} x_e = 2|S|, \]

and since $\sum_{e \in \delta(S)} x_e \ge 2$, we get $\sum_{e \in E(S)} x_e \le |S| - 1$. ∎

Shmoys and Williamson (1990) have shown that the Held and Karp lower bound 
(Problem HK) has a particular monotonicity property, and as a consequence, they 
obtain a new proof of an old result from Wolsey (1980) who showed: 

Theorem 5.5.2 For every instance of the TSP for which the distance matrix
satisfies the triangle inequality, we have $Z^* \le \frac{3}{2} Z_{LP}$.

The proof presented here is based on the monotonicity property established 
by Shmoys and Williamson (1990). However, we use a powerful tool discovered 
by Goemans and Bertsimas (1993), called the parsimonious property. This is a 
property that holds for a general class of network design problems. 

To present the property consider the following linear program defined on the
complete graph $G = (V, E)$. Associated with each vertex $i \in V$ is a given number
$r_i$ which is either zero or two. Let $V_2 = \{i \in V \mid r_i = 2\}$.

We will analyze the following linear program (here ND stands for network 
design) . 

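\[
\begin{aligned}
\text{Problem } ND: \quad \min \ & \sum_{e \in E} d_e x_e \\
\text{s.t.} \ & \sum_{e \in \delta(i)} x_e = r_i, \quad \forall i = 1, 2, \ldots, n \qquad (5.16) \\
& \sum_{e \in \delta(S)} x_e \ge 2, \quad \forall S \subset V,\ V_2 \cap S \ne \emptyset,\ V_2 \cap (V \setminus S) \ne \emptyset \qquad (5.17) \\
& 0 \le x_e \le 1, \quad \forall e \in E. \qquad (5.18)
\end{aligned}
\]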

It is easy to see that when $V_2 = V$ this linear program is equivalent to the linear
program Problem HK. We now provide a short proof of the following result.

Lemma 5.5.3 The optimal solution value to Problem ND is unchanged if we omit
constraint (5.16).

Our proof is similar to the proof presented in Bienstock and Simchi-Levi (1993);
see also Bienstock et al. (1993), which uses a result of Lovász (1979). In his book
of problems (Exercise 6.51), Lovász presents the following result, together with a
short proof. But first, we need a definition.

Definition 5.5.4 An undirected graph G is k-connected between two vertices i
and j if there are k (node) disjoint paths between i and j.

Lemma 5.5.5 Let G be an Eulerian multigraph and $s \in V(G)$, such that G is
k-connected between any two vertices different from s. Then, for any neighbor u
of s, there exists another neighbor w of s, such that the multigraph obtained from
G by removing {s, u} and {s, w}, and adding a new edge {u, w} (the splitting-off
operation) is also k-connected between any two vertices different from s.

Lovasz’s proof of Lemma 5.5.5 can be easily modified to yield the following. 

Lemma 5.5.6 Let G be an Eulerian multigraph, $Y \subseteq V(G)$ and $s \in V(G)$, such
that G is k-connected between any two vertices of Y different from s. Then, for
any neighbor u of s, there exists another neighbor w of s, such that the multigraph
obtained from G by removing {s, u} and {s, w}, and adding a new edge {u, w} is
also k-connected between any two vertices of Y different from s.

We can now prove Lemma 5.5.3. 

Proof. Let $V_0 = V \setminus V_2$; that is, $V_0 = \{i \in V \mid r_i = 0\}$. Let Problem ND' be Problem
ND without (5.16). Finally, let x be a rational vector feasible for Problem ND',
chosen such that (i) x is optimal for Problem ND', and (ii) subject to (i), $\sum_{e \in E} x_e$
is minimized.

Let M be a positive integer, large enough so that $v = 2Mx$ is a vector of even
integers. We may regard v (with a slight abuse of notation) as the incidence vector
of the edge-set E of a multigraph G with vertex set V. Clearly, G is Eulerian, and
by (5.17), it is 4M-connected between any two elements of $V_2$.

Now suppose that for some vertex s, $\sum_{e \in \delta(\{s\})} x_e > r_s$ (i.e., s has a degree
larger than $2M r_s$ in G). Let us apply Lemma 5.5.6 to s and any neighbor u of s
(where $Y = V_2$), and let H be the resulting multigraph, with incidence vector z.
Clearly,

\[ \sum_{e \in E} d_e z_e \le \sum_{e \in E} d_e v_e, \]

and so

\[ \sum_{e \in E} d_e \frac{z_e}{2M} \le \sum_{e \in E} d_e x_e. \]

Moreover,

\[ \sum_{e \in E} \frac{z_e}{2M} = \sum_{e \in E} x_e - \frac{1}{2M} < \sum_{e \in E} x_e. \]

Hence, by the choice of x, $\bar{z} = \frac{z}{2M}$ cannot be feasible for Problem ND'.

If $s \in V_0$, then by Lemma 5.5.6, $\bar{z}$ is feasible for Problem ND'. Thus, we must
have $s \in V_2$ and, in fact, $\sum_{e \in \delta(\{t\})} x_e = 0$ for all $t \in V_0$. In other words, E spans
precisely $V_2$, G is 4M-connected and $\sum_{e \in \delta(\{s\})} v_e \ge 4M + 2$. But we claim now
that the multigraph H is 4M-connected. For by Lemma 5.5.6, it could only fail
to be 4M-connected between s and some other vertex, but the only possible cut
of size less than 4M is the one separating s from $V \setminus \{s\}$. Since this cut has at least
4M edges, the claim is proved. Consequently, again we obtain that $\bar{z}$ is feasible for
Problem ND', a contradiction. In other words, $\sum_{e \in \delta(\{i\})} v_e = 2M r_i$ for all i; that is,
(5.16) holds. ∎

An immediate consequence of Lemma 5.5.3 is that in Problem HK, one can
ignore constraint (5.13) without changing the value of its optimal solution. This
new formulation reveals the following monotonicity property of the Held and Karp
lower bound: let $A \subseteq V$ and consider the Held and Karp lower bound on the length
of the optimal traveling salesman tour through the vertices in A; that is,

\[
\begin{aligned}
\text{Problem } HK(A): \quad Z_{LP}(A) = \min \ & \sum_{e \in E} d_e x_e \\
\text{s.t.} \ & \sum_{e \in \delta(S)} x_e \ge 2, \quad \forall S \subset A, \qquad (5.19) \\
& 0 \le x_e \le 1, \quad \forall e \in E. \qquad (5.20)
\end{aligned}
\]

Since any feasible solution to Problem HK(V) is feasible for Problem HK(A), the
cost of this linear program is monotone with respect to the set of nodes A.

We are ready to prove Theorem 5.5.2. 

Proof. Section 3.3.3 presents and analyzes the heuristic developed by Christofides 
for the TSP which is based on constructing a minimum spanning tree plus a 
matching on the nodes of odd degree. Observe that a similar heuristic can be 
obtained if we start from a 1-tree, instead of a minimum spanning tree. Thus, the 
length of the optimal tour is bounded by $W(T^*) + W(M^*(A))$ where $W(T^*)$ is
the weight (cost) of the best 1-tree and $W(M^*(A))$ is the weight of the optimal
weighted matching defined on the set of odd degree nodes in the best 1-tree, 
denoted by A. 

We argue that $W(M^*(A)) \le \frac{1}{2} Z_{LP}(A)$. Let x be an optimal solution to Problem
HK(A). It is easy to see that the vector $\frac{1}{2} x$ is feasible for the following constraints.

\[
\begin{aligned}
& \sum_{e \in \delta(i)} x_e = 1, \quad \forall i \in A \qquad (5.21) \\
& \sum_{e \in E(S)} x_e \le \frac{1}{2}(|S| - 1), \quad \forall S \subset A,\ |S| \ge 3,\ |S| \text{ is odd} \qquad (5.22) \\
& 0 \le x_e \le 1, \quad \forall e \in E. \qquad (5.23)
\end{aligned}
\]

A beautiful result of Edmonds (1965) tells us that these constraints are sufficient
to formulate the matching problem as a linear program. Consequently,

\[ W(M^*(A)) \le \frac{1}{2} Z_{LP}(A) \le \frac{1}{2} Z_{LP}(V) = \frac{1}{2} Z_{LP}, \]

and therefore,

\[ L^* \le W(T^*) + W(M^*(A)) \le Z_{LP} + \frac{1}{2} Z_{LP} = \frac{3}{2} Z_{LP}. \]

∎



5.6 Exercises



Exercise 5.1. Prove Lemma 5.3.2. 



Exercise 5.2. Show that a lower bound on the cost of the optimal traveling 
salesman tour can be given by: 



7777 max 

\N\ ieN 



y dijt 

jGN 



where N is the set of cities and dij is the distance from city i to city j. 



Exercise 5.3. Consider an instance of the Bin-Packing Problem where there are
$m_j$ items of size $w_j \in (0, 1]$ for $j = 1, 2, \ldots, n$. Define a bin configuration to be
a vector $c = (c_1, c_2, \ldots, c_n)$ with the property that $c_i \ge 0$ for $i = 1, 2, \ldots, n$ and
$\sum_{j=1}^{n} c_j w_j \le 1$. Enumerate all possible bin configurations. Let there be M such
configurations. Define $c_{jk}$ to be the number of items of size $w_j$ in bin configuration
k, for $k = 1, 2, \ldots, M$ and $j = 1, 2, \ldots, n$.

Formulate an integer program to solve this Bin-Packing Problem using the
following variables: $x_k$ is the number of times configuration k is used, for
$k = 1, 2, \ldots, M$.



Exercise 5.4. A function $u : [0, 1] \to [0, 1]$ is dual-feasible if for any set of numbers
$w_1, w_2, \ldots, w_k$, we have

\[ \sum_{i=1}^{k} w_i \le 1 \implies \sum_{i=1}^{k} u(w_i) \le 1. \]

(a) Given an instance of the Bin-Packing Problem with item sizes $w_1, w_2, \ldots, w_n$
and a dual-feasible function u, prove that $\sum_{i=1}^{n} u(w_i) \le b^*$.

(b) Assume n is even. Let half of the items be of size | and the other half of size
Find a dual-feasible function u that satisfies:

\[ \sum_{i=1}^{n} u(w_i) = b^*. \]



Exercise 5.5. Consider a list L of n items of sizes in (|, |]. Let $b^{LP}$ be the optimal
fractional solution to the set-partitioning formulation of the Bin-Packing Problem,
and let $b^*$ be the optimal integer solution to the same formulation. Prove that

\[ b^* \le b^{LP} + 1. \]



Exercise 5.6. Prove that if a graph has exactly 2k vertices of odd degree, then
the set of edges can be partitioned into k paths such that each edge is used exactly 
once. 




Part II 



INVENTORY MODELS 




6 



Economic Lot Size Models with Constant 
Demands 



6.1 Introduction 

Production planning is also an area where difficult combinatorial problems appear 
in day to day logistics operations. In this chapter, we analyze problems related 
to lot sizing when demands are constant and known in advance. Lot sizing in 
this deterministic setting is essentially the problem of balancing the fixed costs of 
ordering with the costs of holding inventory. In this chapter, we look at several 
different models of deterministic lot sizing. First we consider the most basic single- 
item model, the Economic Lot Size Model. Then we look at coordinating the 
ordering of several items with a warehouse of limited capacity. Finally, we look at 
a one-warehouse multiretailer system. 



6.1.1 The Economic Lot Size Model 

The classical Economic Lot Size Model, introduced by Harris (1915) (see Erlenkot- 
ter (1990) for an interesting historical discussion), is a framework where we can see 
the simple tradeoffs between ordering and storage costs. Consider a facility, possi- 
bly a warehouse or a retailer, that faces a constant demand for a single item and 
places orders for the item from another facility in the distribution network which 
is assumed to have an unlimited quantity of the product. The model assumes the 
following. 

• Demand is constant at a rate of D items per unit time. 

• Order quantities are fixed at Q items per order.

FIGURE 6.1. Inventory level as a function of time.

• A fixed set-up cost K is incurred every time the warehouse places an order. 

• A linear inventory carrying cost h, also referred to as holding cost, is accrued 
for every unit held in inventory per unit time. 

• The lead time, that is, the time that elapses between the placement of an 
order and its receipt, is zero. 

• Initial inventory is zero. 

• The planning horizon is infinite. 

The objective is to find the optimal ordering policy minimizing total purchasing 
and carrying cost per unit of time without shortage. 

Like all models, this is a simplified version of what might actually occur in 
practice. The assumption of a known fixed demand over the infinite horizon is 
clearly unrealistic. Lead time is most likely positive, and the requirement of a fixed 
order quantity is restrictive. As we shall see, all these assumptions can be easily 
relaxed while maintaining a relatively simple optimal policy. For the purposes of 
understanding the basic tradeoffs in the model, we keep the assumptions listed 
above. 

It is easy to see that an optimal ordering policy must satisfy the Zero Inventory
Ordering Property which says that every order is received precisely when the inventory
level drops to zero. This can be seen by considering the case where an order
is placed when the inventory level is not zero. In that case, cost is not increased if
we simply wait until inventory is zero to order.

To find the optimal ordering policy in the Economic Lot Size Model, we consider
the inventory level as a function of time (see Figure 6.1). This is the so-called
saw-toothed inventory pattern. We refer to the time between two successive
replenishments as a cycle time. Thus, total inventory cost in a cycle of length T
is

\[ K + \frac{hTQ}{2}, \]

and since $Q = TD$, the average total cost per unit of time is

\[ \frac{KD}{Q} + \frac{hQ}{2}. \]

Minimizing this expression over Q (by setting its derivative to zero), the optimal
order quantity is

\[ Q^* = \sqrt{\frac{2KD}{h}}. \]

This quantity is referred to as the Economic Order Quantity (EOQ) and it is the
quantity at which inventory set-up cost per unit of time ($\frac{KD}{Q}$) equals inventory
holding cost per unit of time ($\frac{hQ}{2}$).
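As a quick numerical check of the EOQ formula, the sketch below evaluates the average cost and the optimal order quantity for made-up data. The function names and the values of K, D and h are illustrative assumptions.

```python
from math import sqrt

def average_cost(Q, K, D, h):
    """Average cost per unit time of ordering Q items each cycle: KD/Q + hQ/2."""
    return K * D / Q + h * Q / 2

def economic_order_quantity(K, D, h):
    """Order quantity minimizing average_cost: sqrt(2KD/h)."""
    return sqrt(2 * K * D / h)

# Hypothetical data: set-up cost 100, demand 50 per unit time, holding cost 4.
K, D, h = 100.0, 50.0, 4.0
Q_star = economic_order_quantity(K, D, h)
print(Q_star, average_cost(Q_star, K, D, h))
```

With these numbers the set-up and holding components of the cost are equal at $Q^*$, and ordering either more or less per cycle raises the average cost.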

We now see how some of our assumptions can be relaxed, without losing any of 
the simplicity of the model. Consider the case in which initial inventory is positive, 
say at level $I_0$; then the first order for $Q^*$ items is simply delayed until time $\frac{I_0}{D}$.
Further, the assumption of zero lead time can also be easily relaxed. In fact, the 
model can handle any deterministic lead time L. To do this simply place an order 
for Q* items when the inventory level is DL. On the other hand, relaxing the 
assumptions of fixed demands and infinite planning horizon requires significant 
changes to the above solution. 



6.1.2 The Finite Horizon Model 

To make the model more realistic, we now introduce a finite horizon, say t. For 
instance, in the retail apparel industry, such a horizon may represent an 8-12 week 
period, for example, the “winter season,” in which demand for the product might 
be assumed to be constant and known. We also relax the assumption that the 
order quantities are fixed. We seek an inventory policy on the interval [0, t] that 
minimizes ordering and carrying costs. 

For this purpose, consider any inventory policy, say $\mathcal{P}$, that places $m \ge 1$ orders
in the interval $[0, t]$. Clearly, the first order must be placed at time zero and the last
must be placed so that the inventory at time $t$ is zero. For any $i$, $1 \le i \le m-1$, let
$T_i$ be the time between the placement of the $i$th order and the $(i+1)$st order and let
$T_m$ be the time between the placement of the last order and $t$. Thus, by definition,
$t = \sum_{i=1}^{m} T_i$, and $\mathcal{P}$ places the $j$th order at time $\sum_{i=1}^{j-1} T_i$ for $1 \le j \le m$. Again,
it is clear that the policy $\mathcal{P}$ must satisfy the Zero Inventory Ordering Property.
Figure 6.2 illustrates the inventory level of the policy $\mathcal{P}$.

FIGURE 6.2. Inventory level as a function of time under policy $\mathcal{P}$.

For the policy $\mathcal{P}$, let $I(\tau)$ be the inventory level at time $\tau \in [0, t]$. Thus, the
total cost per unit of time associated with $\mathcal{P}$ is

$$\frac{1}{t}\Big[Km + h\int_0^t I(\tau)\,d\tau\Big].$$


The only thing we know about the function $I(\tau)$ is that it decreases at a rate of $D$
(a slope of $-D$) between orders and reaches zero exactly $m$ times. Thus, we can
express the total inventory up to time $t$ as a function of the time between orders
as follows.

$$\int_0^t I(\tau)\,d\tau = \sum_{i=1}^{m} \frac{T_i \cdot DT_i}{2} = \frac{D}{2}\sum_{i=1}^{m} T_i^2.$$

Consequently, if $m$ orders are placed we can find the best times to place them by
solving:

$$\min\Big\{\sum_{i=1}^{m} T_i^2 \;\Big|\; \sum_{i=1}^{m} T_i = t,\ T_i \ge 0,\ \forall i = 1, 2, \ldots, m\Big\}.$$

The optimal solution to this convex optimization problem is $T_i = t/m$ for each
$i = 1, 2, \ldots, m$. Hence, an optimal policy must have the following property.



Property 6.1.1 For a problem with one product over the interval $[0, t]$, the inventory
policy with minimum cost that places $m$ orders is achieved by placing orders
of equal size at equally spaced points in time.
The property thus implies that total purchasing and carrying cost per unit time
associated with $\mathcal{P}$ is at least

$$\frac{Km}{t} + \frac{hDt}{2m}.$$

Consequently, by selecting the value of $m$ that minimizes this value we can construct
a policy of minimal cost. Let

$$\alpha = t\sqrt{\frac{hD}{2K}},$$

and thus the best value of $m$ is either $\lfloor\alpha\rfloor$ or $\lceil\alpha\rceil$, depending on which yields
smaller cost. Thus our policy in the finite horizon case is in fact very similar to 
the infinite horizon case. Orders are placed at regularly spaced intervals of time, 
and of course the orders are of the same size each time. 
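
The choice of $m$ can be sketched in a few lines of Python. The function names and the sample data are hypothetical. The sketch assumes the continuous minimizer of $Km/t + hDt/(2m)$ is $\alpha = t\sqrt{hD/(2K)}$ and guards against $\lfloor\alpha\rfloor = 0$, since at least one order must be placed.

```python
import math

def cost_per_unit_time(m, K, h, D, t):
    """Cost of m equal orders at equally spaced times on [0, t]: K*m/t + h*D*t/(2*m)."""
    return K * m / t + h * D * t / (2.0 * m)

def best_number_of_orders(K, h, D, t):
    """Compare floor(alpha) and ceil(alpha), keeping m >= 1, and return the cheaper one."""
    alpha = t * math.sqrt(h * D / (2.0 * K))
    candidates = {max(1, math.floor(alpha)), max(1, math.ceil(alpha))}
    return min(candidates, key=lambda m: cost_per_unit_time(m, K, h, D, t))

# Illustrative data (hypothetical): alpha = 2.5, so m is either 2 or 3.
K, h, D, t = 100.0, 4.0, 50.0, 2.5
m = best_number_of_orders(K, h, D, t)
print(m, t / m, D * t / m)   # number of orders, reorder interval, order size
```

The reorder interval is then $t/m$ and each order is of size $Dt/m$, in line with Property 6.1.1.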



6.1.3 Power of Two Policies 

Consider the infinite horizon model described in Section 6.1. For this model we 
know that average total cost per unit of time is 



$$\frac{KD}{Q} + \frac{hQ}{2} = \frac{K}{T} + \frac{hTD}{2} \equiv f(T),$$



where T is the time between orders. In this subsection, following Muckstadt and 
Roundy (1993), we introduce a new class of policies called power-of-two policies. 

To simplify the analysis, and in accordance with the notation used in the literature
(see Roundy, 1985 and Muckstadt and Roundy, 1993), let $g = hD/2$ and
hence

$$f(T) = \frac{K}{T} + gT.$$

Observe that the function $f(T)$ motivates another interpretation of the model. We
can consider the problem to be an Economic Lot Size model with unit demand
rate, that is, $D = 1$, and inventory holding cost $2g$. The optimal reorder interval
is $T^* = \sqrt{K/g}$ and total cost per unit time is $f(T^*) = 2\sqrt{Kg}$.

One difficulty with the Economic Lot Size Model is that the optimal reorder
interval $T^*$ may take on any value and thus might lead to highly impractical
optimal policies. For instance, reorder intervals of $\sqrt{3}$ days, or $\sqrt{\pi}$ weeks would
not be easy to implement. That is, the model might specify that orders be placed
on Monday of one week, Thursday of the next, Tuesday of the next week, etc., a
schedule of orders that may not have an easily recognizable pattern. Therefore, it
is natural to consider policies where the reorder interval $T$ is restricted to values
that would entail easily implementable policies. One such restriction is termed the
power of two restriction. In this case, $T$ is restricted to be a power of two multiple
of some fixed base planning period $T_B$; that is,

$$T = T_B 2^k, \quad k \in \{0, 1, 2, 3, \ldots\}. \tag{6.1}$$
Such a policy is called a power of two policy. The base planning period $T_B$ may
represent a day, week, month, etc. and is usually fixed beforehand. It represents
the minimum possible reorder interval.

Restricting ourselves to power of two policies requires addressing the following 
issues. 



• How does one find the best power of two policy, the one minimizing the cost 
over all possible power of two policies? 

• How far from optimal is the best policy of this type? 

We start by answering the first question. Let $T^* = \sqrt{K/g}$ be the optimal (unrestricted)
reorder interval and let $T$ be the optimal power of two reorder interval.
Since $f$ is convex, the optimal $k$ in (6.1) is the smallest integer $k$ satisfying

$$f(T_B 2^k) \le f(T_B 2^{k+1}),$$

or

$$\frac{K}{T_B 2^k} + gT_B 2^k \le \frac{K}{T_B 2^{k+1}} + gT_B 2^{k+1}.$$

Hence, $k$ is the smallest integer such that

$$\frac{1}{\sqrt{2}}\,T^* \le T_B 2^k = T.$$



Thus, finding the optimal power of two policy is straightforward. 

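Observe that by the definition of the optimal $k$, it must also be true that

$$T = T_B 2^k \le \sqrt{2}\,T^*,$$

and hence the optimal power of two policy, for a given base planning period $T_B$,
must be in the interval $[\frac{1}{\sqrt{2}}T^*, \sqrt{2}\,T^*]$. It is easy to verify that

$$f\Big(\frac{1}{\sqrt{2}}T^*\Big) = f(\sqrt{2}\,T^*) = \frac{1}{2}\Big(\frac{1}{\sqrt{2}} + \sqrt{2}\Big) f(T^*),$$

and hence, since $f$ is convex, we have

$$\frac{f(T)}{f(T^*)} \le \frac{1}{2}\Big(\frac{1}{\sqrt{2}} + \sqrt{2}\Big) \approx 1.06.$$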



Consequently, the average inventory purchasing and carrying cost of the best power 
of two policy is guaranteed to be within 6% of the average cost of the overall 
minimum policy. The reader can see that this property is a result of the “flatness” 
of the function f around its minimum. 

This restriction, to powers of two multiples of the base planning period, will also 
prove to be quite useful later in a more general setting. 
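
The rounding rule can be illustrated with a minimal Python sketch. The function names and the sample values are hypothetical. The sketch assumes $T_B \le \sqrt{2}\,T^*$; if the base period is larger than that, $k = 0$ is forced and the 6% guarantee need not hold.

```python
import math

def f(T, K, g):
    """Average cost per unit of time for reorder interval T: K/T + g*T."""
    return K / T + g * T

def best_power_of_two(K, g, T_B):
    """Return (k, T) with T = T_B * 2**k, where k is the smallest nonnegative
    integer such that T_B * 2**k >= T*/sqrt(2) and T* = sqrt(K/g)."""
    T_star = math.sqrt(K / g)
    k = 0
    while T_B * 2 ** k < T_star / math.sqrt(2.0):
        k += 1
    return k, T_B * 2 ** k

# Illustrative data (hypothetical): T* = 10 with a base period of 1.
K, g, T_B = 100.0, 1.0, 1.0
k, T = best_power_of_two(K, g, T_B)
T_star = math.sqrt(K / g)
print(k, T, f(T, K, g) / f(T_star, K, g))
```

Here $T^*/\sqrt{2} \approx 7.07$, so the rule selects $k = 3$ and $T = 8$, at a cost only 2.5% above $f(T^*)$.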
6.2 Multi-Item Inventory Models

6.2.1 Introduction

The previous models established optimal inventory policies for single item models. 
It is simple to show that without the presence of joint order costs, a problem with 
several items each facing a constant demand can be handled by solving each item’s 
replenishment problem separately. In reality, management of a single warehouse 
inventory system involves coordinating inventory orders to minimize cost without 
exceeding the warehouse capacity. The warehouse capacity limits the total volume 
held by the warehouse at any point in time. This constraint ties together the dif- 
ferent items and necessitates careful coordination (or scheduling) of the orders. 
That is, it is not only important to know how often an item is ordered, but ex- 
actly the point in time at which each order takes place. This problem is called the 
Economic Warehouse Lot Scheduling Problem (EWLSP). The scheduling part, 
hereafter called the Staggering problem, is exactly the problem of time-phasing 
the placement of the orders to satisfy the warehouse capacity constraint. Unfor- 
tunately, this problem has no easy solution and consequently it has attracted a 
considerable amount of attention in the last three decades. 

The earliest known reference to the problem appears in Churchman et al. (1957) 
and subsequently in Holt (1958) and Hadley and Whitin (1963). These authors 
were concerned with determining lot sizes that made an overall schedule satisfy 
the capacity constraint, and not with the possibility of phasing the orders to avoid 
holding the maximum volume of each item at the same time. Thus, they only con- 
sidered what are called Independent Solutions, wherein every item is replenished 
without any regard for coordination with other items. 

Several authors considered another class of policies called Rotation Cycle policies 
wherein all items share the same order interval. Homer (1966) showed how to 
optimally time-phase (stagger) the orders to satisfy the warehouse constraint for a 
given common order interval. Page and Paul (1976), Zoller (1977) and Hall (1988) 
independently rediscovered Homer’s result. At the end of his paper devoted to 
Rotation Cycle policies, Zoller indicates the possibility of partitioning the items 
into disjoint subsets, or clusters, if the assumption of a Rotation Policy “proves 
to be too restrictive.” This is precisely Page and Paul’s partitioning heuristic. 
In their heuristic, all the items in a cluster share a common order interval. The 
orders are then optimally staggered within each cluster, but no attempt is made to 
time-phase the orders of different clusters. Goyal (1978) argued that such a time- 
phasing across the different clusters may lead to further reduction in warehouse 
space requirements. Hartley and Thomas (1982) and Thomas and Hartley (1983) 
considered the two-item case in detail. 

Recently a number of studies have been concerned with the strategic version 
of the EWLSP in which the warehouse capacity is not a constraint but rather a 
decision variable. These include Hodgson and Howe (1982), Park and Yun (1985), 
Hall (1988), Rosenblatt and Rothblum (1990) and Anily (1991). In this model, 
the inventory carrying cost consists of two parts; one part is proportional to the 
average inventory while the second part is proportional to the peak inventory. A
component of the latter cost, discussed in Silver and Peterson (1985), is the cost 
of leasing the storage space. This cost is typically proportional to the size of the 
warehouse, and not to the inventories actually stored in it. 

Define a policy to be a Stationary Order Size policy if all replenishments of 
an item are of the same size. Likewise, a Stationary Order Intervals policy has 
all orders for an item equally spaced in time. It is easily verified that an optimal 
Stationary Order Size (respectively, Stationary Order Interval) policy is also a 
Stationary Order Interval (respectively, a Stationary Order Size) policy if every 
order of an item is received precisely when the inventory of that item drops to zero; 
that is, it also satisfies the Zero Inventory Ordering property. Thus, it is natural to 
consider policies that have all three properties: Stationary Order Size, Stationary 
Order Interval and Zero Inventory Ordering. We call such policies Stationary Order 
Sizes and Intervals policies, in short, SOSI policies. Two “extreme” cases of SOSI 
policies are the Independent Solutions and the Rotation Cycle policies defined 
above. All the authors cited above considered SOSI policies exclusively. Zoller 
claims that SOSI policies are the only rational alternative, and most authors agree 
that SOSI policies are much easier to implement in practice. In his Ph.D. thesis, 
however, Hariga (1988) investigated both time-variant and stationary order sizes. 
He was motivated to study time-variant order sizes by their successful application 
in resolving the feasibility issue in the Economic Lot Scheduling Problem (ELSP) 
(see Dobson (1987)). 

The paper by Anily departs from earlier work on the EWLSP in its focus on 
worst-case performance of heuristics. In her paper, Anily restricts herself to the 
class of SOSI policies for the strategic model. She proves lower bounds on the 
minimum required warehouse size and on the total cost for this class of policies. 
She presents a partitioning heuristic of which the best Independent Solution and 
the best Rotation Cycle policies are special cases. This partitioning heuristic is 
similar to the one proposed by Page and Paul for the tactical model, although the 
precise methods for finding the partition are different. Anily proves that the ratio 
of the cost of the best Independent Solution to her lower bound is at most $\sqrt{2}$. She
also provides a data-dependent bound for the best Rotation Cycle, derived from
Jones and Inman’s (1989) work on the Economic Lot Size Problem. As a result,
her partitioning heuristic is at least as good as either special case, and thus has a
worst case bound of $\sqrt{2}$ relative to SOSI policies.

In this section we determine easily computable lower bounds on the cost of the 
EWLSP as well as some simple heuristics for the problem. These bounds are used 
to determine the worst-case performance of these heuristics on different versions 
of the problem. First, in Section 6.2.2, we introduce notation, state assumptions 
and formally define the strategic and tactical versions of the EWLSP. In Section 
6.2.3, we establish the worst-case results. The discussion in this section is based 
on the work of Gallego et al. (1996). 
6.2.2 Notation and Assumptions

Let $N = \{1, 2, \ldots, n\}$ be a set of $n$ items each facing a constant unit demand rate
(this can be done without loss of generality). An ordering cost $K_i$ is incurred each
time an order for item $i$ is placed. A linear holding cost $2h_i$ is accrued for each
unit of item $i$ held in inventory per unit of time. Demand for each item must be
met over an infinite horizon without shortages or backlogging.

The volume of inventory of item $i$ held at a given point in time is the product
of its inventory level at that time and the volume usage rate of item $i$, denoted by
$\gamma_i > 0$. The volume usage rate is defined as the volume displaced by one unit of
item $i$. Without loss of generality, we select the unit of volume so that $\sum_{i=1}^{n} \gamma_i = 1$.

The objective in the strategic version of the EWLSP is to minimize the long-run
average inventory carrying and ordering cost plus a cost proportional to the
maximum volume held by the warehouse at any point in time. Formally, for any
inventory policy $\mathcal{P}$, let $V(\mathcal{P})$ denote the maximum inventory volume held by the
warehouse and let $C(\mathcal{P})$ be the long-run average inventory carrying and ordering
cost incurred by this policy. Then, the objective is to find a policy $\mathcal{P}$ minimizing

$$Z(\mathcal{P}) = C(\mathcal{P}) + V(\mathcal{P}).$$

The tactical version of the EWLSP has also received much attention in the
literature. There, the objective is to find a policy $\mathcal{P}$ minimizing the long-run
average inventory carrying and ordering costs subject to the inventory volume never
exceeding the warehouse capacity. Hence, the tactical version can be formulated
as: find a policy $\mathcal{P}$ minimizing $C(\mathcal{P})$ subject to $V(\mathcal{P}) \le v$, where $v$ denotes the
available warehouse volume.



6.2.3 Worst-Case Analyses 

Preliminaries 

We present here two simple results that are used in subsequent analyses. 

Given a SOSI policy, let $T = \{T_1, T_2, \ldots, T_n\}$ be the vector of reorder intervals
where $T_i$ is the reorder interval of item $i$. For any such vector $T$, let $V(T)$ denote
the maximum volume of inventory held by the warehouse over all points in time.
The following provides a simple upper bound on $V(T)$.

Lemma 6.2.1 For any vector $T = \{T_1, T_2, \ldots, T_n\}$, we have

$$V(T) \le \sum_{i=1}^{n} \gamma_i T_i.$$

Proof. Clearly, the inventory level of item $i$, at any moment in time, is no more
than $T_i$ (recall demand is 1 for all $i$). ∎
For the next result we need some additional notation. Consider any inventory
policy $\mathcal{P}$ and any time interval $[0, t]$. Let $V(\mathcal{P}, t)$ be the maximum inventory held
by the warehouse in policy $\mathcal{P}$ over the interval $[0, t]$ and $C(\mathcal{P}, t)$ be the average
inventory ordering and holding cost incurred over $[0, t]$. Let $m_i$ be the number of
times the warehouse places an order for item $i$ over the interval $[0, t]$. For $\tau \in [0, t]$,
let $I_i(\tau)$ be the inventory level of item $i$ at time $\tau$. Let $v_i(\tau)$ be the volume of
inventory held by item $i$ at time $\tau$; that is, $v_i(\tau) = \gamma_i I_i(\tau)$. Also, let
$v(\tau) = \sum_{i=1}^{n} v_i(\tau)$ be the volume of inventory held by the warehouse at time $\tau$.

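Lemma 6.2.2 For any inventory policy $\mathcal{P}$ and time interval $[0, t]$, we have

$$\frac{1}{2}\sum_{i=1}^{n} \frac{\gamma_i t}{m_i} \le \frac{1}{t}\sum_{i=1}^{n} \gamma_i \int_{\tau=0}^{t} I_i(\tau)\,d\tau \le V(\mathcal{P}, t).$$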



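Proof. Clearly, $v(\tau) \le V(\mathcal{P}, t)$ for all $\tau \le t$. Taking the integral up to time $t > 0$
gives

$$\begin{aligned}
V(\mathcal{P}, t) &\ge \frac{1}{t}\int_{\tau=0}^{t} \sum_{i=1}^{n} v_i(\tau)\,d\tau \\
&= \frac{1}{t}\int_{\tau=0}^{t} \sum_{i=1}^{n} \gamma_i I_i(\tau)\,d\tau \\
&= \frac{1}{t}\sum_{i=1}^{n} \gamma_i \int_{\tau=0}^{t} I_i(\tau)\,d\tau \\
&\ge \frac{1}{2}\sum_{i=1}^{n} \frac{\gamma_i t}{m_i},
\end{aligned}$$

where the last inequality follows from Property 6.1.1 which states that when $m_i$
orders for a single item are placed over the interval $[0, t]$, the average inventory
level is minimized by placing equal orders at equally spaced points in time. ∎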



The Strategic Model 

Consider the following heuristic for the strategic version of the EWLSP. Use the
vector of reorder intervals $T$ that solves

$$Z^H = \min_{T \ge 0}\Big\{\sum_i \Big(\frac{K_i}{T_i} + h_i T_i\Big) + \sum_i \gamma_i T_i\Big\}.$$

Clearly, the vector $T$ can be found in $O(n)$ time by solving $n$ separate Economic
Lot Size models, and

$$Z^H = 2\sum_i \sqrt{K_i(h_i + \gamma_i)}. \tag{6.2}$$
By Lemma 6.2.1, $Z^H$ must provide an upper bound on the optimal solution value
of the strategic model.

We now construct a lower bound on the optimal solution value over all possible 
inventory policies. The lower bound is the cost of the optimal policy if the ware- 
house cost were based on average inventory rather than maximum inventory. This 
bound will be used to prove the worst-case result. 

Lemma 6.2.3 A lower bound on the optimal solution value over all possible inventory
strategies is given by

$$Z^{LB} = 2\sum_i \sqrt{K_i(h_i + \gamma_i/2)}. \tag{6.3}$$



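Proof. We show that $Z^{LB} \le C(\mathcal{P}, t) + V(\mathcal{P}, t)$ for all possible inventory policies
$\mathcal{P}$ and for all $t > 0$. Given an inventory policy $\mathcal{P}$, where $m_i$ orders for item $i$ are
placed over a time interval $[0, t]$, then

$$C(\mathcal{P}, t) = \frac{1}{t}\sum_i \Big(m_i K_i + 2h_i \int_{\tau=0}^{t} I_i(\tau)\,d\tau\Big).$$

Combining this cost with the lower bound obtained in Lemma 6.2.2 on $V(\mathcal{P}, t)$
yields the following lower bound on $C(\mathcal{P}, t) + V(\mathcal{P}, t)$.

$$\begin{aligned}
C(\mathcal{P}, t) + V(\mathcal{P}, t) &\ge \frac{1}{t}\sum_i \Big[m_i K_i + 2h_i \int_{\tau=0}^{t} I_i(\tau)\,d\tau + \gamma_i \int_{\tau=0}^{t} I_i(\tau)\,d\tau\Big] \\
&= \frac{1}{t}\sum_i \Big[m_i K_i + (2h_i + \gamma_i)\int_{\tau=0}^{t} I_i(\tau)\,d\tau\Big] \\
&\ge \sum_i \Big[K_i \frac{m_i}{t} + \Big(h_i + \frac{\gamma_i}{2}\Big)\frac{t}{m_i}\Big].
\end{aligned}$$

The last inequality again follows from Property 6.1.1. Minimizing the last expression
with respect to $t/m_i$ for each $i \in N$ proves the result. ∎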

We now show that this heuristic is effective in terms of worst-case performance. 



Theorem 6.2.4

$$\frac{Z^H}{Z^{LB}} \le \sqrt{2}.$$

Proof. Combining equations (6.2) and (6.3) we get

$$\frac{Z^H}{Z^{LB}} = \frac{2\sum_i \sqrt{K_i(h_i + \gamma_i)}}{2\sum_i \sqrt{K_i(h_i + \gamma_i/2)}} \le \sqrt{2}.$$

∎
Can this bound be improved? The following example shows that the bound is
tight as the number of items grows to infinity. Consider an example with $n$ items, where
$K_i = K$, $h_i = 0$ and $\gamma_i = \gamma = 1/n$ for all $i \in N$. Clearly,

$$Z^H = 2n\sqrt{K\gamma}.$$

We now construct a feasible solution whose cost approaches the lower bound $Z^{LB}$
as $n$ goes to infinity. Consider a feasible policy $\mathcal{P}$ with identical reorder intervals
denoted by $T$. To reduce the maximum volume $V(T)$, we stagger the orders such
that item $i$ is ordered at times $T\big[\frac{i-1}{n} + k\big]$ for $k \ge 0$. Then the maximum volume
of inventory is $\frac{n+1}{2}\gamma T$. Hence, the cost of policy $\mathcal{P}$ is

$$Z(\mathcal{P}) = \frac{nK}{T} + \frac{n+1}{2}\gamma T.$$

Minimizing with respect to $T$ gives

$$Z(\mathcal{P}) = \sqrt{2n(n+1)K\gamma}.$$

Consequently,

$$\frac{Z^H}{Z^{LB}} \ge \frac{Z^H}{Z(\mathcal{P})} = \frac{2n}{\sqrt{2n(n+1)}}.$$

The limit of this last quantity is $\sqrt{2}$ (as $n$ goes to infinity); hence, along with
Theorem 6.2.4, we see that an example can be constructed where the worst-case
ratio is arbitrarily close to $\sqrt{2}$.
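
The heuristic value (6.2), the lower bound (6.3) and the tight example can be checked numerically with a short Python sketch. The function names and the data are hypothetical; `staggered_cost` encodes only the closed form $\sqrt{2n(n+1)K\gamma}$ derived above for the special case $K_i = K$, $h_i = 0$, $\gamma_i = 1/n$.

```python
import math

def heuristic_cost(K, h, gamma):
    """Z^H = 2 * sum_i sqrt(K_i * (h_i + gamma_i)), equation (6.2)."""
    return 2.0 * sum(math.sqrt(Ki * (hi + gi)) for Ki, hi, gi in zip(K, h, gamma))

def lower_bound(K, h, gamma):
    """Z^LB = 2 * sum_i sqrt(K_i * (h_i + gamma_i / 2)), equation (6.3)."""
    return 2.0 * sum(math.sqrt(Ki * (hi + gi / 2.0)) for Ki, hi, gi in zip(K, h, gamma))

def staggered_cost(n, K):
    """Optimized cost sqrt(2 n (n+1) K gamma) of the staggered common-interval
    policy in the tight example (K_i = K, h_i = 0, gamma_i = 1/n)."""
    gamma = 1.0 / n
    return math.sqrt(2.0 * n * (n + 1) * K * gamma)

# Tight example with an illustrative set-up cost K = 5 (hypothetical value).
for n in (1, 10, 1000):
    K, h, gamma = [5.0] * n, [0.0] * n, [1.0 / n] * n
    z_h = heuristic_cost(K, h, gamma)
    print(n, z_h / lower_bound(K, h, gamma), z_h / staggered_cost(n, 5.0))
```

In this example the ratio $Z^H/Z^{LB}$ equals $\sqrt{2}$ for every $n$, while $Z^H/Z(\mathcal{P})$ increases toward $\sqrt{2}$ as $n$ grows.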



The Tactical Model 

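For the tactical version of the EWLSP, a simple heuristic, denoted HW and first
proposed by Hadley and Whitin (1963), is to solve

$$\begin{aligned}
\text{Problem } P^{HW}: \quad C^{HW} = \min\ & \sum_i \Big(h_i T_i + \frac{K_i}{T_i}\Big) \\
\text{s.t.}\ & \sum_i \gamma_i T_i \le v, \\
& T \ge 0.
\end{aligned}$$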

We show that the HW heuristic has a worst-case performance bound of 2 with respect
to all feasible policies. We do so by proving that the solution to the following
nonlinear program provides a lower bound on the cost of any feasible policy.

$$\begin{aligned}
\text{Problem } P^{LB}: \quad C^{LB} = \min\ & \sum_i \Big(h_i T_i + \frac{K_i}{T_i}\Big) \\
\text{s.t.}\ & \frac{1}{2}\sum_i \gamma_i T_i \le v, \\
& T \ge 0.
\end{aligned} \tag{6.4}$$
Lemma 6.2.5 $C^{LB}$ is a lower bound on the cost of any feasible inventory policy.

Proof. Consider any feasible policy $\mathcal{P}$ over the interval $[0, t]$ that places $m_i$ orders
for item $i$ in $[0, t]$. From Lemma 6.2.2 we have, for all $t > 0$,

$$v \ge V(\mathcal{P}, t) \ge \frac{1}{2}\sum_i \frac{\gamma_i t}{m_i}.$$

The average inventory ordering and holding cost incurred over the interval $[0, t]$
is

$$C(\mathcal{P}, t) = \frac{1}{t}\sum_i \Big[m_i K_i + 2h_i \int_{\tau=0}^{t} I_i(\tau)\,d\tau\Big] \ge \sum_i \Big(\frac{K_i m_i}{t} + \frac{h_i t}{m_i}\Big). \tag{6.5}$$

Again, the last inequality follows from Property 6.1.1.

Thus, by replacing $t/m_i$ with $T_i$ for all $i \ge 1$, we see that minimizing (6.5) subject
to $\frac{1}{2}\sum_i \gamma_i t/m_i \le v$ provides a lower bound on $C(\mathcal{P}, t)$. ∎

We now prove the worst-case bound. 



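Theorem 6.2.6

$$\frac{C^{HW}}{C^{LB}} \le 2.$$

Proof. Let $T^{LB} = \{T_1^{LB}, T_2^{LB}, \ldots, T_n^{LB}\}$ be the optimal solution to $P^{LB}$. Obviously,
$T_i' = \frac{1}{2}T_i^{LB}$ is feasible for $P^{HW}$. Hence,

$$\begin{aligned}
C^{HW} &\le \sum_i \Big(h_i T_i' + \frac{K_i}{T_i'}\Big) \\
&= \frac{1}{2}\sum_i h_i T_i^{LB} + 2\sum_i \frac{K_i}{T_i^{LB}} \\
&\le 2C^{LB}.
\end{aligned}$$

∎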



As in the strategic version, the worst-case bound provided by the above theorem
can be shown to be tight. To do so, consider the case where all items are identical
with $K_i = K$, $h_i = 0$ and $\gamma_i = \gamma = 1/n$ for all $i \in N$. The solution to problem $P^{HW}$
is clearly $T_i = v$ for all $i \in N$, so $C^{HW} = nK/v$. Consider now a feasible policy
$\mathcal{P}$ with identical reorder intervals denoted by $T$ such that an order for item $i$ is
placed at times $T\big[\frac{i-1}{n} + k\big]$ for $k \ge 0$. The maximum volume occupied by policy
$\mathcal{P}$ is $\frac{n+1}{2n}T$. So $T = \frac{2nv}{n+1}$ is feasible and $C(\mathcal{P}) = \frac{K(n+1)}{2v}$. Hence,

$$\lim_{n\to\infty} \frac{C^{HW}}{C(\mathcal{P})} = \lim_{n\to\infty} \frac{nK/v}{K(n+1)/2v} = 2.$$
By performing a similar analysis one can obtain worst-case bounds on the performance
of heuristics for other versions of the EWLSP. For instance, for the Joint
Replenishment version of the strategic model, where an additional set-up cost $K_0$ is
incurred whenever an order for one or more items is placed, the worst-case bound
of a heuristic, similar to the one described for the EWLSP, can be shown to be $\sqrt{3}$.
The worst-case bound on the tactical version of the Joint Replenishment model
can be shown to be $2\sqrt{2}$.
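
For the tactical model of Section 6.2.3, Problem $P^{HW}$ is a separable convex program with a single linear constraint. One way to solve it numerically, sketched below in Python, is to bisect on a Lagrange multiplier $\lambda \ge 0$ with $T_i(\lambda) = \sqrt{K_i/(h_i + \lambda\gamma_i)}$. This solution method, the function names and the data are assumptions made for illustration; the text does not prescribe a particular algorithm.

```python
import math

def intervals(K, h, gamma, lam):
    """Stationary point of the Lagrangian: T_i = sqrt(K_i / (h_i + lam * gamma_i))."""
    return [math.sqrt(Ki / (hi + lam * gi)) for Ki, hi, gi in zip(K, h, gamma)]

def volume_bound(T, gamma):
    """Left-hand side of the P^HW constraint: sum_i gamma_i * T_i."""
    return sum(gi * Ti for gi, Ti in zip(gamma, T))

def cost(T, K, h):
    """Objective of P^HW: sum_i (h_i * T_i + K_i / T_i)."""
    return sum(hi * Ti + Ki / Ti for Ki, hi, Ti in zip(K, h, T))

def solve_hw(K, h, gamma, v, iterations=200):
    """Approximate solution of P^HW by bisection on the multiplier lam."""
    # If every h_i > 0, the unconstrained optimum may already be feasible.
    if all(hi > 0 for hi in h):
        T = intervals(K, h, gamma, 0.0)
        if volume_bound(T, gamma) <= v:
            return T
    # Otherwise look for a multiplier lam > 0 that makes the constraint tight.
    lam_lo, lam_hi = 0.0, 1.0
    while volume_bound(intervals(K, h, gamma, lam_hi), gamma) > v:
        lam_hi *= 2.0
    for _ in range(iterations):
        lam = 0.5 * (lam_lo + lam_hi)
        if volume_bound(intervals(K, h, gamma, lam), gamma) > v:
            lam_lo = lam
        else:
            lam_hi = lam
    return intervals(K, h, gamma, lam_hi)   # feasible side of the bracket

# Tight example from the text with illustrative values n = 4, K = 1, v = 2.
n, v = 4, 2.0
K, h, gamma = [1.0] * n, [0.0] * n, [1.0 / n] * n
T = solve_hw(K, h, gamma, v)
print(T, cost(T, K, h))
```

In the identical-item example the sketch recovers $T_i = v$ and $C^{HW} = nK/v$, matching the values used in the tightness argument.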



6.3 A Single Warehouse Multi-Retailer Model 

6.3.1 Introduction 

Many distribution systems involve replenishing the inventories of geographically 
dispersed retailers. Consider a distribution system in which a single warehouse 
supplies a set of retailers with a single product. Each retailer faces a constant 
retailer-specific demand that must be met without shortage or backlogging. The 
warehouse faces orders for the product from the different retailers and in turn 
places orders to an outside supplier. A fixed, facility-dependent, set-up cost is 
charged each time the warehouse or the retailers receive an order and inventory 
carrying cost is accrued at each facility at a constant facility-dependent rate. The 
objective is to determine simultaneously the timing and sizes of retailer deliveries 
from the warehouse as well as replenishment strategies at the warehouse so as to
minimize long-run average inventory purchasing and carrying costs. 

In the absence of a fixed set-up cost charged when the warehouse places an 
order, the problem can be decomposed into an Economic Lot Size model for each 
retailer. That is, the existence of this cost ties together the different retailers 
requiring the warehouse to coordinate its orders and deliveries to the different 
retailers. It is well known that optimal policies can be very complex and thus 
the problem has attracted a considerable amount of attention in recent years (see 
Graves and Schwarz, 1977; Roundy, 1985). The latter paper presents the best 
approach currently available for this model; it suggests a set of power of two 
reorder intervals for each facility and shows that the cost of this solution is within
6% of a lower bound on the optimal cost. In this section, we present this method 
along with the worst-case bound. 

6.3.2 Notation and Assumptions 

Consider a single warehouse (indexed by 0) which supplies $n$ retailers, indexed
$1, 2, \ldots, n$. We will use the term facility to designate either the warehouse or a
retailer. We make the following assumptions.

• Each retailer faces a constant demand rate of $D_i$ units, for $i = 1, 2, \ldots, n$.

• Set-up cost for an order at a facility is $K_i$, for $i = 0, 1, \ldots, n$.




• Holding cost is $h_0'$ at the warehouse and $h_i'$ at retailer $i$, with $h_i' \ge h_0'$ for
each $i = 1, 2, \ldots, n$.

• No shortages are allowed. 

As demonstrated by several researchers, policies for this problem may be quite 
complex and thus it is of interest to restrict our attention to a subset of all feasible 
policies. A popular subset of policies is the set of nested and stationary policies. A 
nested policy is characterized by having each retailer place an order whenever the 
warehouse does. As in the previous section, stationarity implies that reorder inter- 
vals are constant for each facility. It is easy to show that any policy should satisfy 
the Zero Inventory Ordering Property. Roundy (1985) showed that, although appealing
from a coordination point of view, nested policies may perform arbitrarily
badly in one-warehouse multi-retailer systems. We therefore will not restrict ourselves
to nested policies. We concentrate on policies where each retailer’s reorder
interval is a power of two multiple of a base planning period $T_B$. Below, we
assume the base planning period is fixed. The worst-case bound reduces to 1.02 if
the base planning period can be chosen optimally, although we omit this extension.

Let’s first determine the cost of an arbitrary power of two policy $T = \{T_0, T_1, \ldots,
T_n\}$ that satisfies the Zero Inventory Ordering Property. If we consider the inventory
at the warehouse, then it does not have the saw-toothed pattern. To overcome
this difficulty, it is convenient to introduce the notion of system inventory as well
as echelon holding cost rates. Retailer $i$’s system inventory is defined as the inventory
at retailer $i$ plus the inventory at the warehouse that is destined for retailer
$i$. If we consider the system inventory of retailer $i$, then it has the saw-toothed
pattern. Echelon holding cost rates are defined as $h_0 = h_0'$ and $h_i = h_i' - h_0'$. For
simplicity, define $g_i = \frac{1}{2}h_i D_i$ and $g^i = \frac{1}{2}h_0 D_i$ for each $i = 1, 2, \ldots, n$. To compute
the cost of such a policy, we separate each item in the warehouse’s inventory into
categories depending on the retailer for which the item is destined. Let $H_i(T_0, T_i)$
be the average cost of holding inventory for retailer $i$ at the warehouse and at
retailer $i$. We claim:

$$H_i(T_0, T_i) = g_i T_i + g^i \max\{T_0, T_i\}.$$

To prove this consider the two cases: 

Case 1: $T_i \ge T_0$. Since $T$ is a power of two policy, $T_i \ge T_0$ implies that the
warehouse places an order every time the retailer does. Therefore, the warehouse
never holds inventory for retailer $i$ and average holding cost is

$$\frac{1}{2}h_i' T_i D_i = \frac{1}{2}(h_i + h_0)T_i D_i = (g_i + g^i)T_i.$$

Case 2: $T_i < T_0$. Consider the portion of the warehouse inventory that is destined
for retailer $i$. Using the echelon holding cost rates, that is, inventory at retailer $i$
is charged at a rate of $h_i$ and system inventory is charged at a rate of $h_0$, we have

$$H_i(T_0, T_i) = \frac{1}{2}h_i D_i T_i + \frac{1}{2}h_0 D_i T_0 = g_i T_i + g^i T_0.$$
Therefore, the average cost of a power of two policy $T$ is given by:

$$\sum_{i \ge 0} \frac{K_i}{T_i} + \sum_{i \ge 1} H_i(T_0, T_i). \tag{6.6}$$

Our objective then is to find the power of two policy $T$ that minimizes (6.6).
Our approach to solving this problem is to first minimize the average cost over
all vectors $T \ge 0$, that is, we solve this problem when the restriction to power of
two vectors is relaxed. We then round the solution $T$ to a vector whose elements
are power of two multiples of $T_B$.

For a fixed value of $T_0$, we consider the following problem

$$b_i(T_0) = \inf_{T_i > 0}\Big\{\frac{K_i}{T_i} + H_i(T_0, T_i)\Big\}. \tag{6.7}$$

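To solve this problem, let $\tau_i' = \sqrt{K_i/(g_i + g^i)}$ and let $\tau_i = \sqrt{K_i/g_i}$ and note that $\tau_i' \le \tau_i$
for all $i \ge 1$. Then one can show that

$$b_i(T_0) = \begin{cases}
2\sqrt{K_i(g_i + g^i)} & \text{if } T_0 < \tau_i', \\
\frac{K_i}{T_0} + (g_i + g^i)T_0 & \text{if } \tau_i' \le T_0 \le \tau_i, \\
2\sqrt{K_i g_i} + g^i T_0 & \text{if } \tau_i < T_0.
\end{cases}$$

That is, if $T_0 < \tau_i'$, it is best to choose $T_i^* = \tau_i'$. If $\tau_i' \le T_0 \le \tau_i$, then choose
$T_i^* = T_0$. If $T_0 > \tau_i$, it is best to choose $T_i^* = \tau_i$.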

We now consider minimizing

$$B(T_0) = \frac{K_0}{T_0} + \sum_{i=1}^{n} b_i(T_0)$$

over all $T_0 > 0$. The function $B$ is of the form

$$\frac{K(T_0)}{T_0} + M(T_0) + H(T_0)T_0$$

over any interval where $K(\cdot)$, $M(\cdot)$ and $H(\cdot)$ are constant. For any $T_0$, define the
sets $G(T_0) = \{i : T_0 < \tau_i'\}$, $E(T_0) = \{i : \tau_i' \le T_0 \le \tau_i\}$ and $L(T_0) = \{i : \tau_i < T_0\}$.
Then $K(\cdot)$, $M(\cdot)$ and $H(\cdot)$ are constant on those intervals where $G(\cdot)$, $E(\cdot)$ and $L(\cdot)$
do not change. To find the minimum of $B$, consider the intervals induced by the
$2n$ values $\tau_i'$ and $\tau_i$ for $i = 1, 2, \ldots, n$. Say $T_0$ falls in some specific interval; then
we set

$$T_i^* = \begin{cases}
\tau_i' & \text{if } i \in G(T_0), \\
T_0 & \text{if } i \in E(T_0), \\
\tau_i & \text{if } i \in L(T_0).
\end{cases}$$

The sets $G$, $E$ and $L$ change only when $T_0$ crosses a breakpoint $\tau_i'$ or $\tau_i$ for some
$i \ge 1$. Specifically, if $T_0$ moves from right to left across $\tau_i$, retailer $i$ moves from
$L$ to $E$. If $T_0$ moves from right to left across $\tau_i'$, retailer $i$ moves from $E$ to $G$.
This suggests a simple algorithm to minimize $B(T_0)$. Start with $T_0$ larger than the
largest breakpoint, and let $L = \{1, 2, \ldots, n\}$ and $G = E = \emptyset$. We then successively
decrease $T_0$ moving from interval to interval. On each interval we need only check
that $\sqrt{K(T_0)/H(T_0)}$ falls in the same subinterval as $T_0$. In this case we set
$T_0^* = \sqrt{K(T_0)/H(T_0)}$.

Let $B^* = B(T_0^*) = \inf_{T_0 > 0}\{B(T_0)\}$; then this value is clearly a lower bound on
the cost of any power of two policy.

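We now want to prove that this value is a lower bound on the cost of any policy.
For notational convenience, we abbreviate $G = G(T_0^*)$, $E = E(T_0^*)$ and $L = L(T_0^*)$.
Let $K = K_0 + \sum_{i \in E} K_i$, $G = \sum_{i \in E}(g_i + g^i) + \sum_{i \in L} g^i$ (a scalar, distinct from
the index set $G$) and $M = 2\sqrt{KG}$. We also define for each $i \ge 0$

$$G_i = \begin{cases}
g_i + g^i & \text{if } i \in G, \\
g_i & \text{if } i \in L, \\
\frac{K_i}{(T_0^*)^2} & \text{if } i \in E \cup \{0\},
\end{cases}$$

$G^i = g_i + g^i - G_i$ (for $i \ge 1$), and $M_i = 2\sqrt{K_i G_i}$. In this way we can write $B^*$ as

$$B^* = M + \sum_{i \in L \cup G} M_i. \tag{6.8}$$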



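We now prove that $B^*$ is a lower bound on any policy. We first show that in
fact $B^* = \sum_{i \ge 0} M_i$. From (6.8), we need only show that $M = \sum_{i \in E \cup \{0\}} M_i$:

$$\begin{aligned}
M = 2\sqrt{KG} &= 2\frac{K}{T_0^*} \\
&= 2\sum_{i \in E \cup \{0\}} \frac{K_i}{T_0^*} \\
&= 2\sum_{i \in E \cup \{0\}} \frac{K_i}{\sqrt{K_i/G_i}} \\
&= 2\sum_{i \in E \cup \{0\}} \sqrt{K_i G_i} \\
&= \sum_{i \in E \cup \{0\}} M_i.
\end{aligned}$$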



Consider any policy over an interval $[0, t']$ for $t' > 0$. We show that the total
cost associated with this policy over $[0, t']$ is at least $B^* t'$. Let $m_i$ be the number
of orders placed by facility $i \ge 0$ in the interval $[0, t']$. Let $I_i(t)$ be the inventory
at facility $i \ge 1$ at time $t$ and let $S_i(t)$ be the system inventory of facility $i \ge 1$ at
time $t$. Clearly, total inventory holding cost is

$$\sum_{i \ge 1} \int_0^{t'} \big(g_i I_i(t) + g^i S_i(t)\big)\,dt.$$







We will show that this is no smaller than

$$\sum_{i \ge 1} \int_0^{t'} \big(G_i I_i(t) + G^i S_i(t)\big)\,dt.$$



For this purpose consider the quantity $G_i I_i(t) + G^i S_i(t)$ for each $i \ge 1$. There are
three cases to consider.

Case 1: $i \in G$. Then $G_i = g_i + g^i$ and $G^i = g_i + g^i - G_i = 0$ and since $S_i(t) \ge I_i(t)$
for all $t \ge 0$, we have

$$g_i I_i(t) + g^i S_i(t) \ge G_i I_i(t) + G^i S_i(t).$$

Case 2: $i \in L$. Then $G_i = g_i$ and $G^i = g_i + g^i - G_i = g^i$; hence

$$g_i I_i(t) + g^i S_i(t) = G_i I_i(t) + G^i S_i(t).$$

Case 3: $i \in E$. Then $G_i = K_i/(T_0^*)^2$ and $G^i = g_i + g^i - G_i$. Observe that by definition
if $i \in E$, then $\tau_i' \le T_0^* \le \tau_i$, which implies $g_i \le G_i \le g_i + g^i$. Since $S_i(t) \ge I_i(t)$ for
all $t \ge 0$, then

$$\begin{aligned}
g_i I_i(t) + g^i S_i(t) &= G_i I_i(t) + G^i S_i(t) + (G_i - g_i)(S_i(t) - I_i(t)) \\
&\ge G_i I_i(t) + G^i S_i(t).
\end{aligned} \tag{6.9}$$

Finally, it is a simple exercise (see Exercise 6.7) to show that $G_0 = \sum_{i \ge 1} G^i$
and therefore our lower bound on the inventory holding cost can be written as

$$\sum_{i \ge 1} \int_0^{t'} \big(G_i I_i(t) + G^i S_i(t)\big)\,dt = \sum_{i \ge 0} \int_0^{t'} G_i I_i(t)\,dt,$$

where we have defined $I_0(t) = \frac{1}{G_0}\sum_{i \ge 1} G^i S_i(t)$.

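Hence, total cost per unit of time under this policy is at least

$$\begin{aligned}
\frac{1}{t'} \sum_{i \ge 0} \left( K_i m_i + G_i \int_0^{t'} I_i(t)\,dt \right)
&\ge \sum_{i \ge 0} \left( K_i \frac{m_i}{t'} + G_i \frac{t'}{m_i} \right) \quad \text{(by Property 6.1.1)} \\
&\ge 2 \sum_{i \in L \cup G} \sqrt{K_i G_i} + 2 \sum_{i \in E \cup \{0\}} \sqrt{K_i G_i} \\
&= \sum_{i \ge 0} M_i = B^*.
\end{aligned}$$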



We have thus established that B* is a lower bound on the total cost per unit time 
of any policy. 

Finally, for each $i \in G \cup L$ select a power of two policy (a value of $k$) such that

$$\frac{1}{\sqrt{2}}\, T_i^* \le T_B 2^k \le \sqrt{2}\, T_i^*.$$

For each $i \in E \cup \{0\}$ select a power of two policy (a value of $k$) such that

$$\frac{1}{\sqrt{2}}\, T_0^* \le T_B 2^k \le \sqrt{2}\, T_0^*.$$

It is a simple exercise (Exercise 6.4) to show that the policy constructed in this 
manner has cost at most 1.06 times the cost of the lower bound. 
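
The rounding step above can be illustrated numerically. The following Python fragment is only a sketch: the function names `power_of_two_exponent` and `cost_ratio` are hypothetical, and the single-facility cost form $K/T + gT$ (with minimizer $T^* = \sqrt{K/g}$) is an assumption made for the illustration rather than a restatement of the multi-retailer cost function.

```python
import math

def power_of_two_exponent(t_star, t_base):
    """Return an integer k with t_star/sqrt(2) <= t_base * 2**k <= sqrt(2) * t_star."""
    if t_star <= 0 or t_base <= 0:
        raise ValueError("reorder intervals must be positive")
    # Rounding log2(t_star / t_base) to the nearest integer keeps the
    # rounded interval within a factor sqrt(2) of t_star.
    return round(math.log2(t_star / t_base))

def cost_ratio(t, t_star):
    """Ratio f(t) / f(t_star) for f(T) = K/T + g*T, where t_star = sqrt(K/g)."""
    return 0.5 * (t / t_star + t_star / t)

if __name__ == "__main__":
    t_star, t_base = 3.0, 1.0
    k = power_of_two_exponent(t_star, t_base)
    t = t_base * 2.0 ** k
    print(k, t, cost_ratio(t, t_star))
```

Under these assumptions the ratio is largest when the rounded interval sits exactly a factor $\sqrt{2}$ away from $T^*$, where it equals $\frac{1}{2}(\sqrt{2} + 1/\sqrt{2}) \approx 1.06$.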



6.4 Exercises 



Exercise 6.1. Consider the Economic Lot Size Model and let K be the set-up 
cost, h be the holding cost per item per unit of time and D the demand rate. 
Shortage is not allowed and the objective is to find an order quantity so as to 
minimize the long-run average cost. That is, the objective is to minimize 



$$C(Q) = \frac{KD}{Q} + \frac{hQ}{2},$$

where $Q$ is the order quantity. Suppose the warehouse can order only an integer
multiple of $q$ units. That is, the warehouse can order $q$, or $2q$, or $3q$, etc.

(a) Prove that the optimal order quantity $Q^*$ has the following property. There
exists an integer $m$ such that $Q^* = mq$ and

$$\sqrt{\frac{m-1}{m}} \le \frac{Q_e}{Q^*} \le \sqrt{\frac{m+1}{m}},$$

where $Q_e$, the Economic Order Quantity, is

$$Q_e = \sqrt{\frac{2KD}{h}}.$$

(b) Suppose now that $m \ge 2$. Show that $C(Q^*) \le 1.06\, C(Q_e)$.



Exercise 6.2. (Zavi, 1976) Consider the Economic Lot Size Model with infinite 
horizon and deterministic demand D items per unit of time. When the inventory 
level is zero, production of Q items starts at a rate of P items per unit of time, 
$P > D$. The set-up cost is $K$ dollars and the holding cost is $h$ dollars per item per unit of time. Every time
production starts at a level of $P$ items/time, we incur a cost of $aP$, $a > 0$.

(a) What is the optimal production rate?

(b) Suppose that due to technological constraints, $P$ must satisfy $2D \le P \le 3D$.
What is the optimal production rate and the optimal order quantity?

Exercise 6.3. Consider the Economic Lot Size Model over the infinite horizon.
Assume that when an order of size $Q$ is placed the items are delivered by trucks of
capacity $q$ and thus the number of trucks used to deliver $Q$ is $\lceil Q/q \rceil$, where $\lceil m \rceil$ is
the smallest integer greater than or equal to $m$. The set-up cost is a linear function
of the number of trucks used: it is $K_0 + \lceil Q/q \rceil K$. Holding cost is $h$ dollars per item per unit of time and
shortage is not allowed. What is the optimal reorder quantity?

Exercise 6.4. Prove that the heuristic for the Single Warehouse Multi-Retailer 
Model described in Section 6.3 provides a solution within 1.06 of the lower bound. 

Exercise 6.5. Consider the power of two policies described in the single product 
model of Section 6.1.3. Describe how you could generate a power of three policy 
(a policy where each $T_i = 3^k T_B$ for some integer $k \ge 0$). What is the effectiveness
(in terms of worst-case performance) of the best power of three policy? 

Exercise 6.6. (Porteus, 1985) The Japanese concept of JIT (Just In Time) advo- 
cates reducing set up cost as much as possible. To analyze this concept, consider 
the Economic Lot Size model with constant demand of D items per year, holding 
cost of $h$ dollars per item per year and current set up cost $K_0$. Suppose you can lease
a new technology that allows you to reduce the set up cost from $K_0$ to $K$ at an
annual leasing cost of $A - B\ln(K)$ dollars. That is, reducing the set up cost from
the current set up cost, $K_0$, to $K$ will cost annually $A - B\ln(K)$ dollars. Of course,
we assume that $A - B\ln(K_0) = 0$, which implies that using the current set up cost
requires no leasing cost. What is the optimal set up cost? What is the optimal 
order quantity in this case? 

Exercise 6.7. Show that in the proof of the lower bound, $B^*$, for the single
warehouse multi-retailer model we have $G_0 = \sum_{i \ge 1} G^i$.



Exercise 6.8. Prove equation (6.9). 
7

Economic Lot Size Models with Varying Demands



Our analysis of inventory models so far has focused on situations where demand 
was both known in advance and constant over time. We now relax this latter 
assumption and turn our attention to systems where demand is known in advance, 
yet varies with time. This is possible, for example, if orders have been placed 
in advance, or contracts have been signed specifying deliveries for the next few 
months. In this case, a planning horizon is defined as those periods where demand 
is known. Our objective is to identify optimal inventory policies for single item 
models as well as heuristics for the multi-item case. 



7.1 The Wagner-Whitin Model

Assume we must plan a sequence of orders, or production batches, over a T period 
planning horizon. In each period, a single decision must be made: the size of the 
order or production batch. 

We make the following assumptions. 

• Demand during period $t$ is known and is denoted $d_t \ge 0$.

• The per unit order cost is $c$ and a fixed order cost $K$ is incurred every time
an order is placed; that is, if $y$ units are ordered, the order cost is $cy + K\delta(y)$
(where $\delta(y) = 1$ if $y > 0$, and 0 otherwise).

• The holding cost is $h > 0$ per unit per period.

• Initial inventory is zero.

• Leadtimes are zero; that is, an order arrives as soon as it is placed.

• All ordering and demand occurs at the start of the period. Inventory is 
charged on the amount on hand at the end of the period. 

The problem is to decide how much to order in each period so that demands 
are met without backlogging and the total cost, including the cost of ordering and 
holding inventory, is minimized. This basic model was first analyzed by Wagner
and Whitin (1958) and is now known as the Wagner-Whitin Model.

In this model, it is clear that the total variable order cost incurred will be fixed
and independent of the schedule of orders, and thus this cost can be ignored. Let
$y_t$ be the amount ordered in period $t$, and $I_t$ be the amount of product in inventory
at the end of period $t$. Using these variables, the problem can be formulated as
follows:

$$\begin{aligned}
\text{Problem } WW: \quad \text{Min} \quad & \sum_{t=1}^{T} \left[ K\delta(y_t) + h I_t \right] \\
\text{s.t.} \quad & I_t = I_{t-1} + y_t - d_t, \quad t = 1, 2, \ldots, T & (7.1) \\
& I_0 = 0 & (7.2) \\
& I_t, y_t \ge 0, \quad t = 1, 2, \ldots, T. & (7.3)
\end{aligned}$$

Here constraints (7.1) are called the inventory-balance constraints, while (7.2) simply
specifies initial inventory. Note that the inventory can also be rewritten as
$I_t = \sum_{j=1}^{t} (y_j - d_j)$, and therefore the $I_t$ variables can be eliminated from the
formulation.

Wagner and Whitin made the following important observation. 

Theorem 7.1.1 Any optimal policy is a zero-inventory ordering policy, that is, a
policy in which

$$y_t I_{t-1} = 0, \quad \text{for } t = 1, 2, \ldots, T.$$

Proof. The proof is quite simple. By contradiction, assume there is an optimal
policy in which an order is placed in period $t$ even though the inventory level at
the beginning of the period ($I_{t-1}$) is positive. We will demonstrate the existence
of another policy with lower total cost. Evidently, the $I_{t-1}$ items of inventory were
ordered in various periods prior to $t$. Thus, if we instead order these items in period
$t$, we save all the holding cost they incurred from the time they were each ordered,
and we pay no additional fixed cost since an order is placed in period $t$ anyway. ∎

Thus, ordering only occurs when inventory is zero. A simple corollary is that
in an optimal policy each order exactly covers the demands of an integer
number of subsequent periods.

Using the above property, Wagner and Whitin developed a dynamic programming
algorithm to determine those periods when ordering takes place. By constructing
a simple acyclic network with nodes $V = \{1, 2, \ldots, T+1\}$, we can view
the problem of determining a policy as a shortest path problem. Formally, let $\ell_{ij}$,
the length of arc $(i,j)$ in this network, be the cost of ordering in period $i$ to satisfy
the demands in periods $i, i+1, \ldots, j-1$, for all $1 \le i < j \le T+1$. That is,

$$\ell_{ij} = K + h \sum_{k=i}^{j-1} (k - i)\, d_k.$$

All other arcs have $\ell_{ij} = +\infty$. The length of the shortest path from node 1 to
node $T+1$ in this acyclic network is the minimal cost of satisfying the demands
for periods 1 through $T$. The optimal policy, that is, a specification of the periods
in which an order is placed, can be easily reconstructed from the shortest path
itself. This procedure is clearly $O(T^2)$.
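
The shortest path view can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `wagner_whitin` and the list-based interface are assumptions, and the arc lengths $\ell_{ij}$ are accumulated incrementally so that the whole computation takes $O(T^2)$ time.

```python
def wagner_whitin(d, K, h):
    """Shortest path on nodes 1..T+1 with arc length
    l_ij = K + h * sum_{k=i}^{j-1} (k - i) * d_k.

    d[t-1] is the demand of period t. Returns (minimum cost, order periods).
    """
    T = len(d)
    INF = float("inf")
    best = [INF] * (T + 2)   # best[j]: cheapest way to cover periods 1..j-1
    pred = [0] * (T + 2)     # pred[j]: period of the last order on that path
    best[1] = 0.0
    for i in range(1, T + 1):
        hold = 0.0
        for j in range(i + 1, T + 2):
            # Extending arc (i, j-1) to (i, j) adds demand d_{j-1},
            # which is held for (j - 1 - i) periods.
            hold += h * (j - 1 - i) * d[j - 2]
            cost = best[i] + K + hold
            if cost < best[j]:
                best[j] = cost
                pred[j] = i
    # Walk the predecessor pointers back from node T+1 to recover the orders.
    orders = []
    j = T + 1
    while j > 1:
        orders.append(pred[j])
        j = pred[j]
    orders.reverse()
    return best[T + 1], orders

if __name__ == "__main__":
    print(wagner_whitin([10, 20, 30], K=50, h=1))   # (120.0, [1, 3])
```

In the small instance shown, ordering in period 1 for periods 1 and 2 costs $50 + 20$, and a second order in period 3 costs $50$, for a total of 120.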

Most of the assumptions made above can be relaxed without changing the basic
solution methodology. For example, one can consider problem data that are period
dependent (e.g., $c_t$, $h_t$ or $K_t$). The assumption of zero leadtimes can be relaxed if
one assumes the leadtimes are known in advance and deterministic. In that case,
if an order is required in period $t$, then it is ordered in period $t - L$, where $L$ is
the leadtime.

Researchers have also considered order costs that are general concave functions
of the amount ordered, that is, $c_t(y)$. The problem can be formulated as a network
flow problem with concave arc costs. This was the approach of Zangwill (1966)
who also extended the model to handle backlogging, although the solution method
is only computationally attractive for small size problems.

The Wagner-Whitin model can also be useful if demands during periods well
into the future are not known. This idea is embodied in the following theorem.

Theorem 7.1.2 Let $t$ be the last period a set-up occurs in the optimal order policy
associated with a $T$ period problem. Then for any problem of length $T^* > T$ it is
necessary to consider only periods $\{j : t \le j \le T^*\}$ as candidates for the last
set-up. Furthermore, if $t = T$, the optimal solution to a $T^*$ period problem has
$y_t > 0$.

This result is useful since it shows that if an order is placed in period $t$, the optimal
policy for periods $1, 2, \ldots, t-1$ does not depend on demands beyond period $t$.

Surprisingly, even though the Wagner-Whitin solution procedure is extremely
efficient, simple and intuitive approximate heuristics are often more appealing
to managers. This may explain the popularity of the
Silver-Meal (1973) heuristic or the Part-Period Balancing heuristic of Dematteis
(1968). One important reason is the sensitivity of the optimal strategy to changes
in forecasted demands $d_t$, $t = 1, 2, \ldots, T$. Indeed, in practice these forecasted
demands are typically modified "on-the-fly," and these changes typically imply changes
in the optimal strategy. Some of the previously mentioned heuristics are not as
sensitive to these changes while still producing optimal or near optimal strategies. For
another approach, see Federgruen and Tzur (1991).

Recently researchers have shown that it is possible to take advantage of the
special cost structure in the Wagner-Whitin model and use it to develop faster
exact algorithms (i.e., $O(T)$). This includes the work of Aggarwal and Park (1990),
Federgruen and Tzur (1991) and Wagelmans et al. (1992).

We sketch here the $O(T)$ algorithm of Wagelmans et al., which is the most
intuitive of the ones proposed. It is a backwards dynamic programming approach.
Define $d_{ij} = \sum_{t=i}^{j} d_t$ for $i, j = 1, 2, \ldots, T$, that is, the demand from period $i$ to
period $j$. To describe the algorithm, we will change slightly the way we account
for the holding cost. If an item is ordered in period $i$ to satisfy a demand in period
$j \ge i$, then we are charged $H_i = (T - i + 1)h$ per unit. That is, we incur the holding
cost until the end of the time horizon. As long as we remember to subtract the
constant $h \sum_{t=1}^{T} d_{1t}$ from our final cost, then we are charged exactly the right
amount. With this in mind, define $G(i)$ to be the cost of an optimal solution with a
planning horizon from period $i$ to period $T$, for $i = 1, 2, \ldots, T$. For convenience,
define $G(T+1) = 0$. Then,

$$\begin{aligned}
G(i) &= \min_{i < t \le T+1} \{ K + H_i d_{i,t-1} + G(t) \} \\
&= K + \min_{i < t \le T+1} \{ H_i d_{i,t-1} + G(t) \}.
\end{aligned} \tag{7.4}$$

The final cost is then $G(1) - h \sum_{t=1}^{T} d_{1t}$. Using this recursion, which is just a
reformulation of the shortest path recursion discussed earlier, it is clear that the
complexity is $O(T^2)$. Wagelmans et al.'s $O(T)$ algorithm is based on the crucial
observation that with careful implementation, the total amount of time spent
finding the period that minimizes (7.4) over the entire running of the algorithm is
$O(T)$.
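
Before turning to the faster implementation, it may help to see recursion (7.4) evaluated directly. The Python sketch below is illustrative only: the function name `backward_lot_sizing` and the prefix-sum array are assumptions, and the code runs in $O(T^2)$ time since it scans every candidate $t$ for each $i$.

```python
def backward_lot_sizing(d, K, h):
    """Evaluate recursion (7.4) directly; d[t-1] is the demand of period t."""
    T = len(d)
    # prefix[t] = d_1 + ... + d_t, so d_{ij} = prefix[j] - prefix[i - 1].
    prefix = [0.0] * (T + 1)
    for t in range(1, T + 1):
        prefix[t] = prefix[t - 1] + d[t - 1]
    G = [0.0] * (T + 2)          # G[T + 1] = 0 by convention
    for i in range(T, 0, -1):
        H_i = (T - i + 1) * h    # holding charged until the end of the horizon
        G[i] = K + min(
            H_i * (prefix[t - 1] - prefix[i - 1]) + G[t]
            for t in range(i + 1, T + 2)
        )
    # Remove the overcharge h * sum_t d_{1t} introduced by the modified accounting.
    overcharge = h * sum(prefix[t] for t in range(1, T + 1))
    return G[1] - overcharge

if __name__ == "__main__":
    print(backward_lot_sizing([10, 20, 30], K=50, h=1))   # 120.0
```

For the demands 10, 20, 30 with $K = 50$ and $h = 1$, the recursion gives $G(3) = 80$, $G(2) = 150$ and $G(1) = 220$; subtracting the constant $h \sum_t d_{1t} = 100$ leaves a cost of 120.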

Consider the calculation of $G(i)$. It is useful to plot the points $(d_{jT}, G(j))$ for
$j = i+1, i+2, \ldots, T+1$, where the point $(d_{T+1,T}, G(T+1))$ is simply the origin. Let
$\mathcal{E}$ be the lower convex envelope of these points; then define the function $g(x) = y$
if and only if $(x, y) \in \mathcal{E}$. It is clear that $g$ is a piecewise linear convex function on
$[0, d_{i+1,T}]$ with $g(d_{i+1,T}) = G(i+1)$ and $g(0) = 0$. See Figure 7.1.

FIGURE 7.1. The plotted points and the function g.

Define the breakpoints of $g$ to be all the points $x$ where $g$ changes slope in
addition to the points $x = 0$ and $x = d_{i+1,T}$. If $x$ is a breakpoint, then $x = d_{jT}$
for some period $j \in \{i+1, i+2, \ldots, T+1\}$. Let there be $r$ breakpoints and let
$i+1 = t(1) < t(2) < \cdots < t(r) = T+1$ denote the corresponding periods. These
periods are called efficient because of the following.

Theorem 7.1.3

$$\min_{i < t \le T+1} \{ H_i d_{i,t-1} + G(t) \} = \min_{1 \le p \le r} \{ H_i d_{i,t(p)-1} + G(t(p)) \}.$$

Proof. Suppose that $j$ (with $i+1 < j < T+1$) is not an efficient period and let $k$
and $\ell$ (with $k < j < \ell$) be the two consecutive efficient periods straddling $j$. The
slope of $g$ on $[d_{\ell T}, d_{kT}]$ is equal to $[G(k) - G(\ell)]/d_{k,\ell-1}$, hence

$$g(d_{jT}) = G(\ell) + \frac{G(k) - G(\ell)}{d_{k,\ell-1}}\, d_{j,\ell-1}.$$

Furthermore, $G(j) \ge g(d_{jT})$.

There are two cases to consider.

Case 1: $H_i \ge \frac{G(k) - G(\ell)}{d_{k,\ell-1}}$. Then

$$\begin{aligned}
H_i d_{i,j-1} + G(j) &\ge H_i d_{i,k-1} + H_i d_{k,j-1} + g(d_{jT}) \\
&\ge H_i d_{i,k-1} + \frac{G(k) - G(\ell)}{d_{k,\ell-1}}\, d_{k,j-1} + G(\ell) + \frac{G(k) - G(\ell)}{d_{k,\ell-1}}\, d_{j,\ell-1} \\
&= H_i d_{i,k-1} + G(k).
\end{aligned}$$

Case 2: $H_i < \frac{G(k) - G(\ell)}{d_{k,\ell-1}}$. Then

$$\begin{aligned}
H_i d_{i,j-1} + G(j) &\ge H_i d_{i,\ell-1} - H_i d_{j,\ell-1} + g(d_{jT}) \\
&\ge H_i d_{i,\ell-1} - \frac{G(k) - G(\ell)}{d_{k,\ell-1}}\, d_{j,\ell-1} + G(\ell) + \frac{G(k) - G(\ell)}{d_{k,\ell-1}}\, d_{j,\ell-1} \\
&= H_i d_{i,\ell-1} + G(\ell).
\end{aligned}$$

In both cases, the minimum occurs at an efficient period. ∎

Being able to quickly find the efficient period p that achieves the minimum is 
therefore crucial to the complexity of the algorithm. This step is aided by the 
following result. 



Lemma 7.1.4 Let $k$ and $\ell$, $k < \ell$, be two consecutive efficient periods. If

$$\frac{G(k) - G(\ell)}{d_{k,\ell-1}} \le H_i,$$

then

$$H_i d_{i,k-1} + G(k) \le H_i d_{i,\ell-1} + G(\ell);$$

otherwise

$$H_i d_{i,k-1} + G(k) > H_i d_{i,\ell-1} + G(\ell).$$

Proof. Suppose that $\frac{G(k) - G(\ell)}{d_{k,\ell-1}} \le H_i$; then $G(k) \le H_i d_{k,\ell-1} + G(\ell)$. Adding
$H_i d_{i,k-1}$ to both sides results in $H_i d_{i,k-1} + G(k) \le H_i d_{i,\ell-1} + G(\ell)$. The other
case can be shown in a similar fashion. ∎

We now describe specifically how to find the efficient period achieving the minimum
in (7.4). This is done by keeping an up-to-date list $L$ of the current efficient
periods. Let $\ell(p)$ be the index of the efficient period immediately following efficient
period $p$; that is, $p < \ell(p)$. From Lemma 7.1.4 and the convexity of $g$ it follows
that the value of $j$ that achieves the minimum of

$$\min_{i < j \le T+1} \{ H_i d_{i,j-1} + G(j) \}$$

corresponds to the period $q(i)$ defined by:

$$q(i) = \min\left\{ T+1,\ \min\left\{ p \in L \ \Big|\ p < T+1 \text{ and } \frac{G(p) - G(\ell(p))}{d_{p,\ell(p)-1}} \le H_i \right\} \right\},$$

because then

$$H_i d_{i,p-1} + G(p) > H_i d_{i,\ell(p)-1} + G(\ell(p)), \quad \text{for } p \in L \text{ and } p < q(i),$$

and

$$H_i d_{i,p-1} + G(p) \le H_i d_{i,\ell(p)-1} + G(\ell(p)), \quad \text{for } p \in L \text{ and } p \ge q(i).$$

In fact, it is easy to determine $q(i)$ from $q(i+1)$. Note that $q(i+1) \in L$ and
as long as $q(i+1)$ is efficient it has the same successor $\ell(q(i+1))$ in $L$. Using the
definition of $q(i+1)$ we obtain:

$$\frac{G(q(i+1)) - G(\ell(q(i+1)))}{d_{q(i+1),\ell(q(i+1))-1}} \le H_{i+1} < H_i.$$

Hence, it follows that $q(i) \le q(i+1)$; that is, the values of $q(i)$ do not increase as
$i$ decreases. Therefore, starting at $q(i+1)$ we successively step down through the
efficient periods, one at a time, until we find
$q(i)$. The total amount of time spent searching for $q(i)$ in the entire algorithm is
therefore $O(T)$.

To complete the complexity result, we must be able to quickly update the list of
efficient periods, that is, update the lower convex envelope. After calculating $G(i)$
and plotting the point $(d_{iT}, G(i))$, we search for the smallest efficient period $t(s)$
such that the slope of the line segment connecting $(d_{iT}, G(i))$ to $(d_{t(s),T}, G(t(s)))$
is greater than the slope of the line segment connecting $(d_{t(s+1),T}, G(t(s+1)))$ to
$(d_{t(s),T}, G(t(s)))$ (thus maintaining convexity). Then the new efficient periods are
$i$ and the periods from $t(s)$ to $t(r) = T+1$; the efficient periods between $i+1$ and
$t(s) - 1$ become inefficient. Since a period can become inefficient at most once, one
can verify that the total amount of work spent updating the list $L$ over the entire
algorithm is $O(T)$.
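
The two ingredients described above, a monotone search for the minimizing efficient period and an amortized update of the lower convex envelope, can be combined as in the following Python sketch. It is written in the spirit of the procedure rather than transcribed from Wagelmans et al.: the function name `wagelmans_lot_sizing`, the stack `hull`, and the pointer `ptr` are assumptions of this illustration, and the code assumes strictly positive demands so that the plotted points have distinct abscissas.

```python
def wagelmans_lot_sizing(d, K, h):
    """Linear-time evaluation of recursion (7.4); d[t-1] > 0 is the demand of period t.

    Returns (minimum cost, list of order periods).
    """
    T = len(d)
    if any(v <= 0 for v in d):
        raise ValueError("this sketch assumes strictly positive demands")
    # x[j] = d_{jT}: demand from period j through T, with x[T + 1] = 0.
    x = [0.0] * (T + 2)
    for j in range(T, 0, -1):
        x[j] = x[j + 1] + d[j - 1]
    G = [0.0] * (T + 2)
    succ = [0] * (T + 2)     # succ[i]: next order period after an order in i
    hull = [T + 1]           # efficient periods, in order of increasing x
    ptr = 0                  # position in hull of the current minimizer

    def value(p, H):         # G(p) - H * d_{pT}; equals the objective up to a constant
        return G[p] - H * x[p]

    for i in range(T, 0, -1):
        H = (T - i + 1) * h
        # H grows as i decreases, so the minimizer only moves toward larger x.
        while ptr + 1 < len(hull) and value(hull[ptr + 1], H) <= value(hull[ptr], H):
            ptr += 1
        t = hull[ptr]
        G[i] = K + H * (x[i] - x[t]) + G[t]
        succ[i] = t
        # Insert the point (x[i], G[i]); drop periods that stop being efficient.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            if (G[b] - G[a]) * (x[i] - x[b]) >= (G[i] - G[b]) * (x[b] - x[a]):
                hull.pop()
            else:
                break
        hull.append(i)
        ptr = min(ptr, len(hull) - 1)

    overcharge = h * sum(x[1] - x[t + 1] for t in range(1, T + 1))  # h * sum_t d_{1t}
    orders, i = [], 1
    while i <= T:
        orders.append(i)
        i = succ[i]
    return G[1] - overcharge, orders

if __name__ == "__main__":
    print(wagelmans_lot_sizing([10, 20, 30], K=50, h=1))   # (120.0, [1, 3])
```

Each period is appended to `hull` once and removed at most once, and `ptr` only advances or is clamped to the end of the list, which is the amortized argument behind the linear running time.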
7.2 Models with Capacity Constraints

An important generalization of the Wagner-Whitin model is the inclusion of upper
bounds on the amount that can be ordered or produced in a given period. This
corresponds to adding the following constraints to Problem WW.

$$y_t \le C_t, \quad t = 1, 2, \ldots, T. \tag{7.5}$$

The values $C_t \ge 0$ correspond to the maximum amount that can be ordered (or
produced) in period $t$ due to, for example, limited production capacities.

In this case, the problem is not as simple as before; Florian et al. (1980) show
that in general, the problem is NP-complete. Florian and Klein (1971) propose
a dynamic programming approach which involves solving a sequence of acyclic
shortest path problems for the special case where $C_t = C$ for all $t$. Love (1973)
devises an algorithm based on characterizing the extreme points of the solution space
for the general problem. The branch and bound algorithm of Baker et al. (1978)
seems to be the most computationally effective, although it is not polynomial.

We sketch here the approach of Florian and Klein. For now assume unequal
capacities; most of the structural results proved by Florian and Klein hold in this
more general case. Clearly, a feasible solution exists if and only if

$$\sum_{t=1}^{i} C_t \ge \sum_{t=1}^{i} d_t \quad \text{for } i = 1, 2, \ldots, T.$$

We therefore assume this is satisfied. Let

$$V = \{ y \in \mathbb{R}^T : y \text{ satisfies (7.1), (7.2), (7.3) and (7.5)} \},$$

and let $D$ be the set of extreme points of $V$. Since the objective function is concave
(why?), we know an optimal solution will exist in $D$.
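
The feasibility condition on cumulative capacities and demands is easy to check in code. The following short Python sketch is an illustration only; the function name `is_feasible` is hypothetical.

```python
from itertools import accumulate

def is_feasible(capacities, demands):
    """True if cumulative capacity covers cumulative demand in every period."""
    if len(capacities) != len(demands):
        raise ValueError("capacities and demands must cover the same periods")
    return all(c >= d for c, d in zip(accumulate(capacities), accumulate(demands)))

if __name__ == "__main__":
    print(is_feasible([5, 5, 5, 5], [3, 4, 6, 2]))   # True
    print(is_feasible([5, 5], [3, 8]))               # False
```

In the second call the two periods offer 10 units of capacity in total while 11 units are demanded, so no production plan can avoid backlogging.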

Florian and Klein prove the following Inventory Decomposition Property.

Theorem 7.2.1 Suppose that the constraint

$$I_k = 0, \quad \text{for some } k \in [1, \ldots, T-1]$$

is added to Problem WW and

$$\sum_{j=k+1}^{i} C_j \ge \sum_{j=k+1}^{i} d_j \quad \text{for } i = k+1, \ldots, T$$

holds. Then an optimal solution to the original problem can be found by independently
finding solutions to the problems for the first $k$ periods and for the last $T-k$
periods.

This is clearly a generalization of Theorem 7.1.2. Following this idea, call a
period $t$ a regeneration point if $I_t = 0$. Define a production sequence $S_{ij}$, where
$0 \le i < j \le T$, to be:

$$S_{ij} = \{ (y_{i+1}, y_{i+2}, \ldots, y_j) \mid I_i = I_j = 0,\ I_k > 0 \text{ for } i < k < j \}.$$

Clearly, any production plan can be decomposed into a set of production sequences.
Define a production sequence $S_{ij}$ to be capacity constrained if the production level
in at most one period $k$ ($i+1 \le k \le j$) satisfies $0 < y_k < C_k$ and all other
production levels are either zero or at their capacities.

The authors then characterize the extreme points of V in the following way. 

Theorem 7.2.2

$y \in D \iff y$ consists of capacity constrained production sequences only.

This characterization is done in several steps. First:

Lemma 7.2.3 If $y \in D$, then $y$ consists only of capacity constrained production
sequences.



Proof. Suppose $y \in D$ and $S_{ij}$ is a production sequence of $y$ that is not capacity
constrained. This means there are at least two periods, say $k$ and $\ell$ ($i+1 \le k <
\ell \le j$), in which $0 < y_k < C_k$ and $0 < y_\ell < C_\ell$. Without loss of generality we can
assume there are only two periods of this type.

Let

$$\delta = \frac{1}{2} \min\left\{ y_k,\ C_k - y_k,\ y_\ell,\ C_\ell - y_\ell,\ \min_{i+1 \le t < j} I_t \right\},$$

and let $e_n$ be the $(j-i)$ component vector with a one in the $n$th position and zeros
everywhere else. Define two production sequences

$$S'_{ij} = S_{ij} - \delta e_{k-i} + \delta e_{\ell-i}$$

and

$$S''_{ij} = S_{ij} + \delta e_{k-i} - \delta e_{\ell-i}.$$

Note that production sequence $S'_{ij}$ simply represents a shifting of production from
period $k$ to period $\ell$, while sequence $S''_{ij}$ represents the opposite shift. They are
clearly feasible, and since $\delta > 0$ they are distinct. However, $S_{ij} = \frac{1}{2}(S'_{ij} + S''_{ij})$, a
contradiction. ∎



Lemma 7.2.4 If $y'$ and $y''$ are distinct feasible production plans and $y = \frac{1}{2}(y' +
y'')$, then $y'$ and $y''$ share all the regeneration points of $y$.

Proof. Let period $k$ be a regeneration point of $y$. Then

$$0 = \sum_{t=1}^{k} (y_t - d_t) = \frac{1}{2}\left[ \sum_{t=1}^{k} (y'_t - d_t) + \sum_{t=1}^{k} (y''_t - d_t) \right] = \frac{1}{2}\left( I'_k + I''_k \right).$$

Since $I'_k, I''_k \ge 0$, both $I'_k$ and $I''_k$ must be zero. ∎

Lemma 7.2.5 If a feasible plan $y$ consists only of capacity constrained production
sequences, then $y \in D$.

Proof. Suppose by contradiction that $y \notin D$. Then there exist distinct feasible plans $y'$
and $y''$ such that $y = \frac{1}{2}(y' + y'')$.

From Lemma 7.2.4, $y'$ and $y''$ share the regeneration points of $y$. Let $i$ and $j$ be
two such successive regeneration points, and let $S_{ij}$, $S'_{ij}$ and $S''_{ij}$ be the associated
production sequences of $y$, $y'$ and $y''$, respectively. Evidently,

$$S_{ij} = \frac{1}{2}(S'_{ij} + S''_{ij}).$$

We show that the only possibility is $S_{ij} = S'_{ij} = S''_{ij}$. For this purpose, consider
any period $k$, $i+1 \le k \le j$, and observe that $y_k$ can take only three possible values.
Either $y_k = 0$, in which case $y'_k = y''_k = 0$, or $y_k = C_k$, in which case $y'_k = y''_k = C_k$,
or $0 < y_k < C_k$. Since $S_{ij}$ is a capacity constrained sequence, at most one period,
say period $\ell$, $i+1 \le \ell \le j$, has $0 < y_\ell < C_\ell$. But total production between period
$i+1$ and period $j$ must be equal to total demands over the same periods, and
hence $y_\ell = y'_\ell = y''_\ell$. Consequently, $S_{ij} = S'_{ij} = S''_{ij}$. Since this holds between every pair
of successive regeneration points, $y' = y''$, a contradiction. ∎

This completes the proof of Theorem 7.2.2.

It is now clear that an optimal solution must be made up of a sequence of
optimal capacity constrained production sequences. However, determining these
sequences can be quite tedious and computationally expensive. To make the problem
tractable, Florian and Klein consider the case where the capacity constraints
are identical and equal to $C$. Demand between any two periods, say periods $i$ and
$j$, can then be written as $mC + p$ where $m$ is an integer and $0 \le p < C$. Then:

Corollary 7.2.6 If $C_t = C$ for all $t$, an optimal production sequence has a number
of periods in which production levels are equal to $C$, at most one period where
production level is $0 < p < C$, and the remaining periods have zero production
levels.

This simplifies the problem considerably; for example, consider determining the
optimal production sequence between regeneration points $i$ and $j$. From Corollary
7.2.6, in each period $k \in \{i+1, i+2, \ldots, j\}$ production is either 0, $C$ or $p$ for
some $p \in (0, C)$. Let $Y_k = \sum_{\ell=i+1}^{k} y_\ell$ for $i < k \le j$, that is, the amount produced
between periods $i+1$ and $k$ in this production sequence. Then $Y_k$ can only take
on values in $\{0, p, C, C+p, 2C, \ldots, mC, mC+p\}$.

Thus, we can construct a network where the vertices correspond to the possible
values of $Y_k$ for each $i < k \le j$ with directed edges $(Y_k, Y_{k+1})$ defined by:

• If $Y_k = \ell C$, $\ell = 0, 1, \ldots, m$, then there are three edges emanating from this
vertex: one to $Y_{k+1} = \ell C$ (corresponding to no production in period $k+1$), one
to $Y_{k+1} = \ell C + p$ (corresponding to production of $p$ in period $k+1$) and one to
$Y_{k+1} = (\ell+1)C$ (corresponding to production of $C$ in period $k+1$).

• If $Y_k = \ell C + p$, $\ell = 0, 1, \ldots, m$, then there are two edges emanating from this
vertex: one to $Y_{k+1} = \ell C + p$ (corresponding to no production in period $k+1$)
and one to $Y_{k+1} = (\ell+1)C + p$ (corresponding to production of $C$ in period
$k+1$).

After creating an artificial initial vertex $Y_0$, we see that every path from $Y_0$ to $Y_j$
represents a feasible capacity constrained production sequence. Assigning arc costs
equal to the cost of producing and storing the corresponding product amounts,
it is clear that finding the optimal production sequence from $i$ to $j$ is no harder
than solving the shortest path problem on this network. The complexity of this
procedure is clearly proportional to $(j-i)^2$; thus determining the optimal
production sequences between all pairs of periods is $O(T^4)$.

To determine the optimal production plan over the entire planning horizon,
Florian and Klein solve another shortest path problem on a network similar to the
one formulated in Section 7.1. That is, the length of an arc $(i,j)$ in this network is
the total cost of the optimal production sequence from $i$ to $j$. After solving the
shortest path problem, the optimal set of regeneration points can be found by
checking the shortest path. This step is $O(T^2)$.
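
The two-level shortest path scheme for equal capacities can be sketched as follows. This Python fragment is an illustration under stated assumptions, not Florian and Klein's implementation: the names `equal_capacity_lot_sizing` and `sequence_cost` are hypothetical, demands and the capacity are assumed to be nonnegative integers, the cost consists of a set-up cost $K$ per producing period plus $h$ per unit of end-of-period inventory, and the inner network allows inventory to touch zero between $i$ and $j$ (which only adds feasible plans and so does not change the optimum).

```python
def equal_capacity_lot_sizing(d, C, K, h):
    """Minimum set-up plus holding cost when every period has capacity C.

    d[t-1] is the integer demand of period t. Returns float('inf') if infeasible.
    """
    T = len(d)
    INF = float("inf")
    cum = [0] * (T + 1)
    for t in range(1, T + 1):
        cum[t] = cum[t - 1] + d[t - 1]

    def sequence_cost(i, j):
        # Cheapest capacity constrained sequence for periods i+1..j, I_i = I_j = 0.
        m, p = divmod(cum[j] - cum[i], C)
        # State (full, part): periods produced at C so far, and whether the
        # single batch of size p has already been produced.
        layer = {(0, False): 0.0}
        for k in range(i + 1, j + 1):
            need = cum[k] - cum[i]
            nxt = {}
            for (full, part), cost in layer.items():
                moves = [(full, part, 0)]
                if full < m:
                    moves.append((full + 1, part, C))
                if p > 0 and not part:
                    moves.append((full, True, p))
                for f2, p2, qty in moves:
                    produced = f2 * C + (p if p2 else 0)
                    if produced < need:
                        continue          # demand of period k would be unmet
                    c = cost + (K if qty > 0 else 0) + h * (produced - need)
                    if c < nxt.get((f2, p2), INF):
                        nxt[(f2, p2)] = c
            layer = nxt
        return layer.get((m, p > 0), INF)

    # Outer shortest path over regeneration points 0 = i < j <= T.
    best = [INF] * (T + 1)
    best[0] = 0.0
    for j in range(1, T + 1):
        for i in range(j):
            if best[i] < INF:
                best[j] = min(best[j], best[i] + sequence_cost(i, j))
    return best[T]

if __name__ == "__main__":
    print(equal_capacity_lot_sizing([3, 4, 6, 2], C=5, K=10, h=1))   # 37.0
```

In the example, total demand is 15 and the capacity is 5, so any plan with three set-ups must produce 5 units in each of periods 1, 2 and 3, giving a cost of $30 + (2 + 3 + 2) = 37$; four set-ups already cost 40.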
In many practical situations, the coordination of inventory and ordering policies
involves a variety of different products and this complicates the problem considerably.
Consider the uncapacitated case once again, and assume there are $n$ products.
Each product faces a known demand during the next $T$ periods. In addition, a fixed
order cost of $K_i$ is incurred every time product $i$ is ordered.

For each product $i$, define the following.

• Let $y_{it}$ be the amount of product $i$ ordered in period $t$, for $t = 1, 2, \ldots, T$.

• Let $h_i$ be the inventory holding cost for product $i$.

• Let $I_{it}$ be the amount of product $i$ in inventory at the start of period $t$, for
$t = 1, 2, \ldots, T$.

• Let $d_{it}$ be the demand in period $t$ for product $i$, for $t = 1, 2, \ldots, T$.
Making the same assumptions as in the Wagner-Whitin model, the problem is
then:

$$\begin{aligned}
\text{Problem } P: \quad \text{Min} \quad & \sum_{t=1}^{T} \sum_{i=1}^{n} \left[ K_i \delta(y_{it}) + h_i I_{it} \right] \\
\text{s.t.} \quad & I_{it} = I_{i,t-1} + y_{it} - d_{it}, \quad i = 1, 2, \ldots, n, \ t = 1, 2, \ldots, T & (7.6) \\
& I_{i0} = 0, \quad i = 1, 2, \ldots, n & (7.7) \\
& I_{it}, y_{it} \ge 0, \quad i = 1, 2, \ldots, n, \ t = 1, 2, \ldots, T. & (7.8)
\end{aligned}$$

Here (7.6) are inventory-balance constraints for each product, while (7.7) specify
starting inventory for each product.

It is easy to see that $P$ decomposes into $n$ single product problems. Each of these
single product problems can be solved using the algorithms for the Wagner-Whitin
model.

A more realistic version of this problem is when a joint set-up cost $K_0$ is present.
This cost is incurred whenever any product is ordered. The problem then becomes

$$\begin{aligned}
\text{Problem } P': \quad \text{Min} \quad & \sum_{t=1}^{T} \left[ K_0\, \delta\!\left( \sum_{i=1}^{n} y_{it} \right) + \sum_{i=1}^{n} \left( K_i \delta(y_{it}) + h_i I_{it} \right) \right] \\
\text{s.t.} \quad & \text{(7.6), (7.7) and (7.8).}
\end{aligned}$$

Unfortunately, this problem is considerably more difficult to solve than the
simple Wagner-Whitin model. In fact, Arkin et al. (1989) prove that it is
NP-complete. Several researchers have proposed heuristics for this problem, including
Silver (1976), Atkins and Iyogun (1988) and Joneja (1990). We present here the
approach of Joneja.

The cost covering heuristic of Joneja proceeds period by period in a forward
direction. Specifically, at period $t$, the ordering policy of periods $1, 2, \ldots, t-1$ has
been determined and the decision is which items to order, if any, in period $t$. Let $t_i$
be the last period in which item $i$ was ordered. Let $H_{it}$ denote the total inventory
holding cost incurred by item $i$ since period $t_i$ assuming no order for item $i$ is
placed in period $t$. That is,

$$H_{it} = h_i \sum_{j=t_i+1}^{t} (j - t_i)\, d_{ij}.$$

Intuitively, if we forget for the moment the joint order cost, and $H_{it} > K_i$, then
it is worth ordering item $i$ in period $t$, since it costs more to keep the items in
inventory from period $t_i$ (the last time item $i$ was ordered) to $t$ than to order them
in period $t$. The quantity $\max\{H_{it} - K_i, 0\}$ can be seen as the savings that are
accrued by ordering item $i$ in period $t$. This approach is basically the Silver-Meal
heuristic adapted to the multiple item case. With the joint order cost present, an
order should only be placed if the total savings accrued by ordering a set of items
in period $t$ exceeds the joint order cost. Therefore, Joneja proposes the following
ordering rule.

Rule 1. In period $t$, order those items $i$ such that $H_{it} \ge K_i$, if $\sum_{i=1}^{n} \max\{H_{it} - K_i, 0\} \ge K_0$.
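
Rule 1 can be sketched as a single-period decision in Python. The fragment below is an illustration only: the function names `holding_cost_since_last_order` and `rule1_items_to_order` are hypothetical, periods are numbered from 1 while items are indexed from 0, and the comparisons are taken as weak inequalities, which is an assumption of this sketch.

```python
def holding_cost_since_last_order(h_i, demand_i, t_i, t):
    """H_it = h_i * sum_{j=t_i+1}^{t} (j - t_i) * d_ij; demand_i[j-1] is d_ij."""
    return h_i * sum((j - t_i) * demand_i[j - 1] for j in range(t_i + 1, t + 1))

def rule1_items_to_order(H_t, K, K0):
    """Items (0-based indices) that Rule 1 orders in the current period.

    H_t[i] is H_it, K[i] is the item order cost K_i, K0 is the joint order cost.
    An empty list means no order is placed.
    """
    savings = [max(H - K_i, 0.0) for H, K_i in zip(H_t, K)]
    if sum(savings) >= K0:       # total savings cover the joint order cost
        return [i for i, (H, K_i) in enumerate(zip(H_t, K)) if H >= K_i]
    return []

if __name__ == "__main__":
    # Two items last ordered in period 1, evaluated in period 5.
    d1 = [0, 0, 0, 0, 4.0, 0]
    d2 = [0, 0, 0, 0, 0, 3.0]
    H = [holding_cost_since_last_order(1, d1, 1, 5),
         holding_cost_since_last_order(1, d2, 1, 5)]
    print(H, rule1_items_to_order(H, K=[6, 5], K0=10))   # [16.0, 0] [0]
```

Here item 0 has accumulated a holding cost of 16 against an order cost of 6, so its savings of 10 just cover the joint order cost and it is ordered, while item 1 has no demand yet and is left out.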

Joneja shows that this single rule is not quite strong enough to ensure that the
schedule of orders is cost efficient. For instance, consider the following example
with two products. The holding costs are equal ($h_1 = h_2 = 1$). Pick an integer $m$
and set the demands to

$$\begin{aligned}
d_{1t} &= 0, \quad \text{for } t = 1, 2, \ldots, m-1, & d_{1m} &= \frac{K_0 + K_1}{m-1}, \\
d_{2t} &= 0, \quad \text{for } t = 1, 2, \ldots, m, & d_{2,m+1} &= \frac{K_0 + K_2}{m}.
\end{aligned}$$

Using Rule 1, item 1 will be ordered at time $m$, but not item 2. Item 2 will
be ordered at time $m+1$. If both items were ordered at time $m$, then we pay
$h_2 d_{2,m+1} = \frac{K_0 + K_2}{m}$ in extra holding cost but save $K_0$ in ordering costs. Therefore,
for large $m$, we see that we can be far from optimal.

To counteract this behavior, Joneja proposed the following additional feature.
Let $t_0$ be the time at which the last joint order was placed, and assume item $i$ was
not included in this order (since $H_{i t_0} < K_i$). It may, in some cases, be advantageous
to order item $i$ at time $t_0$ even though Rule 1 would specify the opposite. Define

$$S_{it} = h_i (t_0 - t_i) \sum_{j=t_0}^{t} d_{ij}.$$

Then $S_{it}$ is the savings in inventory holding cost accrued by ordering item $i$ at
time $t_0$. Since a joint order is already placed in period $t_0$, the following rule was
proposed.

Rule 2. In period $t$, if the last joint order was in period $t_0$, item $i$ was not ordered
in period $t_0$ and $S_{it} \ge K_i$, then order item $i$ in period $t_0$.

Computational experiments with this heuristic, whose complexity is $O(nT)$,
show that it produces solutions fairly close to optimal.

7.4 Exercises 



Exercise 7.1. Assume order costs are general concave and time-dependent func- 
tions of the number of items produced. Also, assume holding costs are general 
concave and time-dependent functions of the number of items held in inventory. 
Prove that the Zero-Inventory Ordering Property holds in this general setting as 
well. 

Exercise 7.2. The Silver-Meal Heuristic works as follows. Let $d_1, d_2, \ldots, d_n$ be
the demands in the $n$ period planning horizon. Define $C(T)$ to be the per period
average holding and set-up cost under the condition that the current order covers
demand in the next $T$ periods. Then $C(1) = K$, $C(2) = \frac{1}{2}(K + h d_2)$, etc. In the
Silver-Meal Heuristic we calculate these until $C(i) > C(i-1)$. In this case, we
stop and produce in period 1 to meet the demand of the first $i-1$ periods. We
then start over with the $i$th period.

Construct an example where the Silver-Meal Heuristic provides a nonoptimal 
solution. 




8 

Stochastic Inventory Models 



8.1 Introduction 

The inventory models considered so far are all deterministic in nature; demand 
is assumed to be known and either constant over the infinite horizon or varying 
over a finite horizon. In many logistics systems, however, such assumptions are not 
appropriate. Typically, demand is a random variable whose distribution may be 
known. 

Stochastic inventory models have attracted considerable attention in the last
three decades. The pioneering work of Arrow, Harris and Marschak (1951), Scarf
(1960), Iglehart (1963a and b) and Veinott and Wagner (1965) for a single warehouse,
Clark and Scarf (1960) for multi-echelon systems, Eppen and Schrage (1981)
and Federgruen and Zipkin (1984a-c) for distribution systems, and Rosling (1989)
for assembly systems, all represent milestones in our understanding of complex
stochastic logistics systems. More recently, the work of Zheng (1991), Zheng and
Federgruen (1991) and Chen and Zheng (1994) has revealed new insights and provided
more efficient algorithms for these problems. For recent reviews, we refer the reader
to Lee and Nahmias (1993), Porteus (1990) and the recent book by Zipkin (2000).

In this chapter we review some of the main results in stochastic inventory models.
We start with the analysis of a single warehouse model. To build our intuition,
Section 8.2 considers a single period model. In Sections 8.3 and 8.4 we show that
the insight obtained in the previous section can be used to analyze a multi-period
model. Section 8.5 extends the analysis further to the infinite horizon model.
Finally, Section 8.6 describes the development of interesting bounds on the optimal
cost for multi-echelon systems.

8.2 Single Period Models

8.2.1 The Model 

Consider a risk-neutral company that designs, produces and sells winter fashion
items such as ski jackets, coats, etc. About six months before the winter season, the
company must commit itself to specific production quantities for all its products. 
Since there is no clear indication as to how the market will respond to the new 
designs, these decisions are typically based on realized sales from the last few years, 
current economic conditions and professional judgment. 

To assist management in selecting production quantities, the marketing department
assumes that demand $D$ for each new product is randomly distributed, generated
from a product-specific distribution with continuous cdf $F(\cdot)$. Additional
information available to the decision makers includes the variable production cost
per unit $c$, the selling price per unit $r$, and the salvage value per unit $v$. Clearly,
these parameters should satisfy $r > c > v$; otherwise the problem can trivially be
solved.

Since demand is a random variable, the decision concerning how many units
to produce is based on the expected cost $z(y)$, which is a function of the amount
produced $y$. This expected cost is

$$z(y) = cy - rE[\min(y, D)] - vE[\max(0, y - D)] \quad \text{for } y \ge 0,$$

where $E[\cdot]$ denotes the expectation. Note that
$E[\min(y, D)] = \int_0^y D\,dF(D) + y \int_y^\infty dF(D)$.
Adding and subtracting the quantity $r \int_y^\infty D\,dF(D)$ to $z(y)$, we get

$$z(y) = cy - rE[D] - r \int_{D=y}^{\infty} (y - D)\,dF(D) - v \int_{D=0}^{y} (y - D)\,dF(D). \qquad (8.1)$$

The objective is, of course, to choose y so as to minimize the expected cost z(y). 
This is the so-called newsboy problem or newsvendor problem. 

Taking the derivative of $z(y)$ with respect to $y$ and using the Leibniz rule, we
get the first order optimality condition:

$$c - r(1 - \Pr\{D \le y\}) - v \Pr\{D \le y\} = 0,$$

which implies that the optimal production quantity $S$ should satisfy

$$\Pr\{D \le S\} = \frac{r - c}{r - v}.$$

Since, by assumption, $r - c < r - v$ and $F$ is continuous, a finite value $S$, $S > 0$,
always exists. In addition, it can easily be verified that the expected cost $z(y)$ is
convex for $y \in (0, \infty)$, and that the value of $z(y)$ tends to infinity as $y \to \infty$.
Hence, the quantity $S$ is a minimizer of $z(y)$.

Observe that, implicitly, three assumptions have been made in the above analysis.
First, there is no initial inventory. Second, there is no fixed set-up cost for
starting production. Third, the excess demand is lost; that is, if the demand $D$
happens to be greater than the produced quantity $y$, then the additional revenue
$r(D - y)$ is lost.

The tools developed so far allow us to extend the above results to models with
initial inventory $y_0$ and set-up cost $K$. We now relax the first two assumptions.
Observe that the expected cost of producing $y - y_0$ units is

$$K - cy_0 + z(y).$$

Hence, $S$ clearly minimizes this expected cost if we decide to produce. Consequently,
there are two cases to consider.

1. If $y_0 \ge S$, we should not produce anything.

2. If $y_0 < S$, the best we can do is to raise the inventory to level $S$. However,
this is optimal only if $-cy_0 + z(y_0)$, the cost associated with not producing
anything, is larger than or equal to $K - cy_0 + z(S)$, the cost associated with
producing $S - y_0$. That is, if $y_0 < S$, it is optimal to produce $S - y_0$ only if
$z(y_0) \ge K + z(S)$.

Let $s$ be a number such that $s \le S$ and

$$z(s) = K + z(S).$$

The discussion above implies that the optimal policy has the following structure.

Order $S - y_0$ if the initial inventory level $y_0$ is at or below $s$; otherwise do not order.

We refer to such a policy as an (s, S) policy. The quantity S is called the order- 
up-to level while s is referred to as the reorder point. In the special case with zero 
fixed ordering cost, we have s = S and the policy reduces to a base stock policy: 
when the initial inventory level is no more than S, make an order to raise the 
inventory level to S; otherwise no order is placed. 
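To make the single period $(s, S)$ policy concrete, the following Python sketch computes $S$ from the critical ratio $(r - c)/(r - v)$ and then locates $s$ by bisection on $z(s) = K + z(S)$. It is a minimal illustration under assumptions that are not in the text: demand is taken to be uniform on $[0, b]$, so that $F(y) = y/b$ and $E[\max(0, y - D)] = y^2/(2b)$ for $0 \le y \le b$, and the function name `newsvendor_uniform` is hypothetical.

```python
def newsvendor_uniform(c, r, v, b, K=0.0):
    """Single-period (s, S) parameters for D ~ Uniform(0, b), assuming r > c > v."""
    # Pr{D <= S} = (r - c) / (r - v), and F(y) = y / b for the uniform distribution.
    S = b * (r - c) / (r - v)

    def z(y):
        # Expected cost z(y) = c*y - r*E[min(y, D)] - v*E[max(0, y - D)], 0 <= y <= b.
        overage = y * y / (2.0 * b)        # E[max(0, y - D)]
        sales = y - overage                # E[min(y, D)]
        return c * y - r * sales - v * overage

    target = K + z(S)
    if z(0.0) < target:
        return None, S, z                  # set-up cost too high: never produce
    lo, hi = 0.0, S                        # invariant: z(lo) >= target >= z(hi)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if z(mid) >= target:
            lo = mid
        else:
            hi = mid
    return lo, S, z
```

With $c = 5$, $r = 10$, $v = 2$, $b = 100$ and $K = 50$, the critical ratio is $5/8$, so $S = 62.5$; for this distribution $z$ is quadratic and the reorder point is $s = S - \sqrt{2bK/(r - v)}$. With $K = 0$ the bisection returns $s = S$, the base stock case.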



8.3 Finite Horizon Models 

8.3.1 Model Description 

We are now ready to consider the finite horizon (multi-period) inventory problem. 
This problem can be described as follows. At the beginning of each period, for ex- 
ample, each week or every month, the inventory of a certain item at the warehouse 
is reviewed and the inventory level is noted. Then an order may be placed to raise 
the inventory level up to a certain level. Replenishment orders arrive instantly. 
The case of nonzero lead time is discussed at the end of Section 8.5.

We assume that demands for successive periods are independent and identically
distributed. If the demand exceeds the inventory on hand, then the additional
demand is backlogged and is filled when additional inventory becomes available.
Thus, the backlogged units are viewed as negative inventory. The inventory left
over at the end of the final period has a value of $c$ per unit, and all unfilled
demand at this time can be backlogged at the same cost $c$. As we shall see, these
assumptions ensure that the expected (gross) revenue in each period is a constant,
and therefore we will not include the revenue term in our formulation.

Costs include ordering, holding and shortage costs. Ordering cost consists of a
set-up cost, $K$, charged every time the warehouse places a replenishment order,
and a proportional purchase cost $c$. There is a holding cost of $h^+$ for each unit
of the inventory on hand at the end of a period and a shortage cost of $h^-$ per
unit whenever demand exceeds the inventory on hand. To avoid triviality, we
assume $h^-, h^+ > 0$ (why?). The objective is to determine an inventory policy that
minimizes the expected cost over $T$ periods. In what follows, we show that an
$(s_t, S_t)$ policy is optimal. Of course, an $(s_t, S_t)$ policy is similar to the $(s, S)$ policy
described earlier except that the parameters $s$ and $S$ may vary from period to
period.

To characterize the optimal policy for the finite horizon model we first develop
a dynamic programming formulation of the problem. Let $y_t$ be the inventory level
at the beginning of period $t$ (before possible ordering).

If the inventory level immediately after ordering is $y$, then the expected
one-period shortage and holding cost for that period is

$$G(y) = h^+ \int_D \max(y - D, 0)\,dF(D) + h^- \int_D \max(D - y, 0)\,dF(D), \qquad (8.2)$$

which is the so-called one-period loss function. Since the maximum of convex
functions is convex and convexity is preserved under integration, we see that $G(y)$
is convex.

Given a policy $Y = (y^1, y^2, \ldots, y^T)$, where $y^t$ is the order-up-to level (a random
variable) of period $t$ and may be contingent upon other variables, the sum of the
total expected proportional purchasing cost and salvage value is given by

$$P_Y = E\Big[\sum_{t=1}^{T} c(y^t - y_t) - c(y^T - D_T)\Big],$$

where $D_t$ is the realized demand in period $t$. Noting that $y_{t+1} = y^t - D_t$, we have

$$P_Y = cE\big[y^1 - y_1 + y^2 - (y^1 - D_1) + \cdots + y^T - (y^{T-1} - D_{T-1}) + D_T - y^T\big] = cTE[D],$$

taking the initial inventory $y_1$ to be zero (otherwise the sum telescopes to
$cTE[D] - cy_1$, which is again a constant).

Thus, $P_Y$ is independent of the ordering policy, and we can drop the linear
ordering cost component from the formulation. This observation is quite intuitive,
since all backlogged demand is filled at the end of the last period while all
inventory remaining at that time is salvaged, both at the same price $c$. We also remark
that whenever possible, we will suppress the subscript $t$ from $D_t$ (because demands
are iid) and the superscript $t$ from $y^t$.

To formulate the dynamic program, define the following two expected cost functions.
Recall that $y_t$ is the inventory level, prior to ordering, at the beginning of
period $t$. Let $G^t(y_t)$ be the expected cost for the remaining $T - t + 1$ periods if we do
not order in period $t$ and act optimally in the remaining $T - t$ periods. Let $z^t(y_t)$
be the minimal expected cost incurred through the remaining $T - t + 1$ periods if
we act optimally in period $t$ and all the remaining $T - t$ periods. It follows that for
$t = 1, 2, \ldots, T$,

$$G^t(y_t) = G(y_t) + \int_D z^{t+1}(y_t - D)\,dF(D),$$

and

$$z^t(y_t) = \min_{y \ge y_t} \{K\delta(y - y_t) + G^t(y)\}, \qquad (8.3)$$

where $z^{T+1}(y) = 0$ for any $y$, and $\delta(x)$ is 1 if $x > 0$ and 0 otherwise.

Note that if we order up to the level $y > y_t$ in period $t$, the cost for the final
$T - t + 1$ periods is $K + G^t(y)$.

Notice that the functions $G^t(y)$ and $z^t(y)$ are in general not convex and may even have
many local minima. In order to show that an $(s_t, S_t)$ policy is optimal for this
model, we employ the concept of $K$-convexity, which provides us with a powerful tool
to analyze stochastic inventory models with fixed ordering cost.
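The recursion (8.3) can be sketched in Python on an integer grid, which may help in seeing how the pairs $(s_t, S_t)$ arise. The sketch rests on assumptions not made in the text: demand is discrete with a finite probability mass function `pmf`, inventory levels are restricted to the integers `lo..hi`, and the function name `finite_horizon_sS` is hypothetical. Levels below `lo` are clamped to `lo`; this is harmless only if `lo` lies at or below every reorder point (so that $z^t$ is already constant there) and `hi` lies above every order-up-to level.

```python
def finite_horizon_sS(pmf, K, h_plus, h_minus, T, lo, hi):
    """Backward recursion (8.3) on the integer grid lo..hi.

    pmf maps a nonnegative integer demand d to Pr{D = d}.
    Returns {t: (s_t, S_t)} for t = 1, ..., T.
    """
    levels = range(lo, hi + 1)

    def G(y):
        # One-period loss (8.2) for a discrete demand distribution.
        return sum(p * (h_plus * max(y - d, 0) + h_minus * max(d - y, 0))
                   for d, p in pmf.items())

    z_next = {y: 0.0 for y in levels}          # z^{T+1} = 0
    policy = {}
    for t in range(T, 0, -1):
        # G^t(y) = G(y) + E[z^{t+1}(y - D)]; levels below lo are clamped to lo.
        Gt = {y: G(y) + sum(p * z_next[max(y - d, lo)] for d, p in pmf.items())
              for y in levels}
        S_t = min(levels, key=Gt.get)          # order-up-to level: minimizer of G^t
        threshold = Gt[S_t] + K
        # Reorder point: largest y <= S_t where ordering up to S_t is no worse.
        s_t = next((y for y in range(S_t, lo - 1, -1) if Gt[y] >= threshold), lo - 1)
        policy[t] = (s_t, S_t)
        # z^t(y) = min{G^t(y), K + min_{x > y} G^t(x)}, scanning from the top down.
        z, best_above = {}, float("inf")
        for y in range(hi, lo - 1, -1):
            z[y] = min(Gt[y], K + best_above)
            best_above = min(best_above, Gt[y])
        z_next = z
    return policy
```

With $T = 1$ and $K = 0$ the sketch reduces to the base stock case of Section 8.2: $S_1$ is the smallest level whose cumulative demand probability reaches $h^-/(h^+ + h^-)$, and $s_1 = S_1$.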

8.3.2 K-Convex Functions

Definition 8.3.1 A real-valued function $f$ is called $K$-convex for $K \ge 0$, if for
any $x_0 \le x_1$ and $\lambda \in [0, 1]$,

$$f((1 - \lambda)x_0 + \lambda x_1) \le (1 - \lambda)f(x_0) + \lambda f(x_1) + \lambda K. \qquad (8.4)$$

Below we summarize properties of $K$-convex functions.

Lemma 8.3.2 (a) A real-valued convex function is also 0-convex and hence $K$-convex
for all $K \ge 0$. A $K_1$-convex function is also a $K_2$-convex function
for $K_1 \le K_2$.

(b) If $f_1(y)$ and $f_2(y)$ are $K_1$-convex and $K_2$-convex respectively, then for
$\alpha, \beta \ge 0$, $\alpha f_1(y) + \beta f_2(y)$ is $(\alpha K_1 + \beta K_2)$-convex.

(c) If $f(y)$ is $K$-convex and $\zeta$ is a random variable, then $E_\zeta[f(y - \zeta)]$ is also
$K$-convex, provided $E[|f(y - \zeta)|] < \infty$ for all $y$.

(d) Assume that $f$ is a continuous $K$-convex function and $f(y) \to \infty$ as $|y| \to \infty$.
Let $S$ be a minimum point of $f$ and $s$ be any element of the set

$$\{x \mid x \le S,\ f(x) = f(S) + K\}.$$

Then the following results hold.

(i) $f(S) + K = f(s) \le f(y)$, for all $y \le s$.

(ii) $f(y)$ is a non-increasing function on $(-\infty, s)$.

(iii) $f(y) \le f(z) + K$ for all $y, z$ with $s \le y \le z$.

Proof. Parts (a), (b) and (c) are straightforward and are left as an exercise. Hence
we focus on part (d).

Let $S$ be a minimum point of the function $f$ and $s$ be any element of the set
$\{x \mid x \le S,\ f(x) = f(S) + K\}$.

The existence of $s$ and $S$ is guaranteed, since $f$ is continuous and $f(y) \to \infty$ as
$|y| \to \infty$.

Consider any $y$ and $y'$ with $y < y' \le s$. There exists a $\lambda \in [0, 1]$ such that
$y' = (1 - \lambda)y + \lambda S$. The $K$-convexity of the function $f(x)$ implies that

$$f(y') \le (1 - \lambda)f(y) + \lambda(f(S) + K) = (1 - \lambda)f(y) + \lambda f(s). \qquad (8.5)$$

Part (d) (i) follows from (8.5) upon letting $y' = s$, which immediately implies part
(d) (ii).

Finally, consider any $y$ and $z$ with $s \le y \le z$. If $y \le S$, there exists a $\lambda \in [0, 1]$
such that $y = (1 - \lambda)s + \lambda S$. Since $f(x)$ is $K$-convex and $S$ is a global minimizer
of the function $f$, we have

$$f(y) \le (1 - \lambda)(f(s) - K) + \lambda f(S) + K = f(S) + K \le f(z) + K.$$

If $y \ge S$, there exists a $\lambda \in [0, 1]$ such that $y = (1 - \lambda)S + \lambda z$. Again the $K$-convexity
of the function $f$ and the definition of $S$ imply that

$$f(y) \le (1 - \lambda)f(S) + \lambda f(z) + \lambda K \le f(z) + K.$$

Figure 8.1 gives an illustration of the properties of $K$-convex functions in Lemma
8.3.2 part (d).

FIGURE 8.1. Illustration of the Properties of K-convex Functions

Proposition 8.3.3 If $f(x)$ is a $K$-convex function, then the function

$$g(x) = \min_{y \ge x} \{Q\delta(y - x) + f(y)\}$$

is $\max\{K, Q\}$-convex.

Proof. We only need to discuss the case $K \ge Q$. In fact, when $K < Q$, the
$K$-convexity of $f(x)$ implies the $Q$-convexity of $f(x)$, and the $Q$-convexity of the
function $g(x)$ follows from the case $K \ge Q$. Hence we assume that $K \ge Q$.

Let $E = \{x \mid g(x) = f(x)\}$ and $O = \{x \mid g(x) < f(x)\}$. We show that for any
$x_0$, $x_1$ and $\lambda \in [0, 1]$ with $x_0 \le x_1$,

$$g(x_\lambda) \le (1 - \lambda)g(x_0) + \lambda g(x_1) + \lambda K,$$

where $x_\lambda = (1 - \lambda)x_0 + \lambda x_1$. We consider four different cases.

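Case 1: $x_0, x_1 \in E$. In this case,

$$\begin{aligned}
g(x_\lambda) &\le f(x_\lambda) \\
&\le (1 - \lambda)f(x_0) + \lambda f(x_1) + \lambda K \\
&= (1 - \lambda)g(x_0) + \lambda g(x_1) + \lambda K,
\end{aligned}$$

where the second inequality follows from the $K$-convexity of the function $f(x)$.

Case 2: $x_0, x_1 \in O$. In this case, let $g(x_i) = Q + f(y_i)$ for $i = 0, 1$ with $y_i \ge x_i$
and let $y_\lambda = (1 - \lambda)y_0 + \lambda y_1$. It is clear that $y_0 \le y_1$ and $y_\lambda \ge x_\lambda$. Furthermore,

$$\begin{aligned}
g(x_\lambda) &\le Q + f(y_\lambda) \\
&\le (1 - \lambda)(Q + f(y_0)) + \lambda(Q + f(y_1)) + \lambda K \\
&= (1 - \lambda)g(x_0) + \lambda g(x_1) + \lambda K,
\end{aligned}$$

where the second inequality follows from the $K$-convexity of the function $f(x)$.

Case 3: $x_0 \in E$, $x_1 \in O$. Let $g(x_1) = Q + f(y_1)$ with $y_1 \ge x_1$. Let
$x_\lambda = (1 - \mu)x_0 + \mu y_1$ with $\mu \le \lambda$. Then

$$\begin{aligned}
g(x_\lambda) &\le f(x_\lambda) \\
&\le (1 - \mu)f(x_0) + \mu f(y_1) + \mu K \\
&= (1 - \lambda)g(x_0) + \lambda g(x_1) + \mu K + (\lambda - \mu)(f(x_0) - f(y_1)) - \lambda Q \\
&\le (1 - \lambda)g(x_0) + \lambda g(x_1) + \mu K - \lambda Q + (\lambda - \mu)Q \\
&\le (1 - \lambda)g(x_0) + \lambda g(x_1) + \lambda K,
\end{aligned}$$

where the second inequality follows from the $K$-convexity of the function $f(x)$ and
the third inequality holds since $f(x_0) \le Q + f(y_1)$.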



Case 4: $x_0 \in O$, $x_1 \in E$. Let $g(x_0) = Q + f(y_0)$ for $y_0 \ge x_0$. We distinguish
between two different cases.

Subcase 1: $x_\lambda \le y_0$. In this case,

$$\begin{aligned}
g(x_\lambda) &\le Q + f(y_0) \\
&= (1 - \lambda)(Q + f(y_0)) + \lambda f(x_1) + \lambda(Q + f(y_0) - f(x_1)) \\
&\le (1 - \lambda)g(x_0) + \lambda g(x_1) + \lambda Q,
\end{aligned}$$

where the last inequality holds since $f(y_0) \le f(x_1)$.

Subcase 2: $x_\lambda \ge y_0$. Let $x_\lambda = (1 - \mu)y_0 + \mu x_1$ with $\mu \le \lambda$. Then

$$\begin{aligned}
g(x_\lambda) &\le f(x_\lambda) \\
&\le (1 - \mu)f(y_0) + \mu f(x_1) + \mu K \\
&= (1 - \lambda)g(x_0) + \lambda g(x_1) + \mu K + (\lambda - \mu)(f(y_0) - f(x_1)) - (1 - \lambda)Q \\
&\le (1 - \lambda)g(x_0) + \lambda g(x_1) + \lambda K,
\end{aligned}$$

where the second inequality follows from the $K$-convexity of the function $f(x)$ and
the last inequality holds since $f(y_0) \le f(x_1)$. ∎
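Definition 8.3.1 and Proposition 8.3.3 can be checked numerically on a grid. The Python sketch below is only a sanity check under assumptions of its own: points are integers, inequality (8.4) is tested for grid triples only (so a pass is necessary for $K$-convexity, not sufficient), the minimization defining $g$ is restricted to grid points, and the names `is_K_convex` and `order_transform` are hypothetical.

```python
def is_K_convex(f, K, points):
    """Check inequality (8.4) for every triple x0 < x < x1 of integer grid points."""
    pts = sorted(points)
    for i, x0 in enumerate(pts):
        for k in range(i + 2, len(pts)):
            x1 = pts[k]
            for x in pts[i + 1:k]:
                # With lam = (x - x0) / (x1 - x0), (8.4) multiplied by (x1 - x0) reads
                # (x1 - x0) f(x) <= (x1 - x) f(x0) + (x - x0) (f(x1) + K).
                if (x1 - x0) * f(x) > (x1 - x) * f(x0) + (x - x0) * (f(x1) + K):
                    return False
    return True


def order_transform(f, Q, points):
    """g(x) = min over grid y >= x of Q * delta(y - x) + f(y)."""
    pts = sorted(points)

    def g(x):
        best_order = min((Q + f(y) for y in pts if y > x), default=float("inf"))
        return min(f(x), best_order)

    return g
```

For example, with $f(y) = |y|$ (convex, hence 0-convex) and $Q = 3$, the transformed function equals 3 for $x \le -3$, $-x$ for $-3 \le x \le 0$ and $x$ for $x \ge 0$. It is not convex, but it passes the 3-convexity check, as Proposition 8.3.3 predicts.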

8.3.3 Main Results

It remains to show that an $(s_t, S_t)$ policy is optimal for every $t$, $t = 1, 2, \ldots, T$.
For this purpose, it is sufficient to prove that the function $G^t(y)$ is $K$-convex, and
$G^t(y) \to \infty$ as $|y| \to \infty$, for each period $t$, $t = 1, 2, \ldots, T$.

Theorem 8.3.4 (a) For any $t = 1, 2, \ldots, T$, $G^t(y)$ and $z^t(y)$ are continuous
and $\lim_{|y| \to \infty} G^t(y) = \infty$.

(b) For any $t = 1, 2, \ldots, T$, $G^t(y)$ and $z^t(y)$ are $K$-convex.

(c) For any $t = 1, 2, \ldots, T$, there exist two parameters $s_t$ and $S_t$ such that it is
optimal to make an order to raise the inventory level to $S_t$ when the initial
inventory level is no more than $s_t$ and to order nothing otherwise.

Proof. We prove by induction. For $t = T$, $G^T(y) = G(y)$ for all $y$. Hence $G^T(y)$ is
continuous, $K$-convex (in fact convex) and $\lim_{|y| \to \infty} G^T(y) = \infty$.

Assume that $G^t(y)$ is continuous, $K$-convex and $\lim_{|y| \to \infty} G^t(y) = \infty$. Then
Lemma 8.3.2 part (d) allows us to show that there exist two parameters $s_t$ and $S_t$
with $s_t \le S_t$ such that $S_t$ minimizes $G^t(y)$ and $G^t(s_t) = G^t(S_t) + K$. Furthermore,

$$z^t(y) = \begin{cases} K + G^t(S_t), & \text{if } y \le s_t, \\ G^t(y), & \text{otherwise.} \end{cases}$$

Since $G^t(s_t) = G^t(S_t) + K$, $z^t(y)$ is continuous and Proposition 8.3.3 implies that
$z^t(y)$ is $K$-convex.

Finally, $G^{t-1}(y) = G(y) + E[z^t(y - D)]$. Therefore $G^{t-1}(y)$ is continuous, and
from Lemma 8.3.2 part (c), $G^{t-1}(y)$ is $K$-convex. Moreover $\lim_{|y| \to \infty} G^{t-1}(y) = \infty$,
since $z^t(y) \ge G^t(S_t)$ for any $y$. ∎

So far we have assumed that demands are identically distributed and the cost
parameters, $c$, $h^+$ and $h^-$, are time independent. These assumptions can be easily relaxed
and an $(s, S)$ policy is still optimal. Indeed, in Chapter 9, we analyze the finite
horizon inventory and pricing model, including the inventory model analyzed in
this section as a special case, under more general assumptions.



8.4 Quasiconvex Loss Functions 

The above proof of the optimality of $(s_t, S_t)$ policies relies on the fact that the
one-period loss function $G(y)$ is convex. In many practical situations this assumption
is not appropriate. For instance, consider the previous model, but assume that
whenever a shortage occurs, an emergency shipment is requested. Suppose further
that this emergency shipment incurs a fixed cost plus a linear cost proportional
to the shortage level. It can be easily shown that the new loss function $G(y)$ is, in
general, not convex.

To overcome this difficulty, Veinott (1966) offers a different yet elegant proof for
the optimality of $(s_t, S_t)$ policies under the assumption that $-G(y)$ is unimodal
or $G(y)$ is quasiconvex. Here we provide a slightly simplified proof suggested by
Chen (1996) for the model considered here. Recall the concept of quasiconvexity.

Definition 8.4.1 A function $f$ is quasiconvex on a convex set $X$ if for any $x$ and
$y \in X$ and $0 \le q \le 1$,

$$f(qx + (1 - q)y) \le \max\{f(x), f(y)\}.$$

As we already pointed out in Chapter 2, a convex function is also quasiconvex,
and $f$ is quasiconvex if $-f(x)$ is unimodal.

Consider the following $T$-period model:

$$z^t(y_t) = \min_{y \ge y_t} \{K\delta(y - y_t) + G^t(y)\} \qquad (8.6)$$

where

$$G^t(y) = G(y) + E_D[z^{t+1}(y - D)], \quad \text{for } t = 1, 2, \ldots, T. \qquad (8.7)$$

In the analysis below we use the following assumptions on $G(y)$.

(i) $G(y)$ is continuous and quasiconvex.

(ii) $G(y) > \inf_x G(x) + K$ as $|y| \to \infty$.

Other assumptions on ordering costs and demands are the same as in the previous
section.

If (i) and (ii) hold, there is a number $y^*$ that minimizes $G(y)$. In addition, there
are two numbers $s\ (\le y^*)$ and $S\ (\ge y^*)$ such that

$$G(S) = G(y^*) + K, \qquad (8.8)$$

$$G(s) = G(y^*) + K. \qquad (8.9)$$

It is also worth mentioning that $G(y)$ is nonincreasing in $y$ on $(-\infty, y^*]$ and
nondecreasing in $y$ on $(y^*, \infty)$.

To prove the optimality of an $(s_t, S_t)$ policy for all $t$, we need the next two
lemmas.

Lemma 8.4.2 For $t = 1, \ldots, T$, and $y \le y'$,

$$z^t(y) \le z^t(y') + K \qquad (8.10)$$

and

$$G^t(y') - G^t(y) \ge G(y') - G(y) - K. \qquad (8.11)$$

Proof. It follows that

$$\begin{aligned}
z^t(y) &= \min\{G^t(y),\ K + \min_{x > y} G^t(x)\} \\
&\le K + \min_{x \ge y} G^t(x) \\
&\le K + \min_{x \ge y'} G^t(x) \\
&\le K + z^t(y').
\end{aligned}$$

We also provide an alternative proof here. The result obviously holds for $y' = y$.
Now assume that $y' > y$. Suppose that at the beginning of the period, the inventory
level prior to any ordering is $y$. Consider the following strategy: we first raise the
inventory level up to $y'$ and then act optimally as if we started with the inventory
level $y'$ (prior to any ordering). Such a strategy incurs a cost of at most $K + z^t(y')$.
Because this strategy is not necessarily optimal, it follows that

$$z^t(y) \le K + z^t(y'),$$

which also proves (8.10).

Inequality (8.10) implies that

$$\begin{aligned}
G^t(y') - G^t(y) &= G(y') - G(y) + E_D[z^{t+1}(y' - D)] - E_D[z^{t+1}(y - D)] \\
&\ge G(y') - G(y) - K,
\end{aligned}$$

which completes the proof. ∎

Lemma 8.4.3 For $t = 1, \ldots, T$, and $y \le y' \le y^*$,

$$G^t(y') - G^t(y) \le G(y') - G(y) \le 0 \qquad (8.12)$$

and

$$z^t(y') \le z^t(y). \qquad (8.13)$$

Proof. The proof is by induction. Note that $G(y)$ is nonincreasing in $y$ for $y \le y^*$.

For $t = T$, $G^T(y') - G^T(y) = G(y') - G(y) \le 0$, which implies that
$\min_{x > y'} G^T(x) = \min_{x > y} G^T(x)$. Then,

$$\begin{aligned}
z^T(y') &= \min\{G^T(y'),\ K + \min_{x > y'} G^T(x)\} \\
&\le \min\{G^T(y),\ K + \min_{x > y'} G^T(x)\} \\
&= \min\{G^T(y),\ K + \min_{x > y} G^T(x)\} = z^T(y).
\end{aligned}$$

Assume that, for period $t + 1$ and $y \le y' \le y^*$,

$$G^{t+1}(y') - G^{t+1}(y) \le G(y') - G(y) \le 0 \quad \text{and} \quad z^{t+1}(y') \le z^{t+1}(y).$$

Now it follows immediately that

$$\begin{aligned}
G^t(y') - G^t(y) &= G(y') - G(y) + E_D[z^{t+1}(y' - D)] - E_D[z^{t+1}(y - D)] \\
&= G(y') - G(y) + E_D[z^{t+1}(y' - D) - z^{t+1}(y - D)] \\
&\le G(y') - G(y) \le 0, \qquad (8.14)
\end{aligned}$$

and

$$\begin{aligned}
z^t(y') &= \min\{G^t(y'),\ K + \min_{x > y'} G^t(x)\} \\
&\le \min\{G^t(y),\ K + \min_{x > y} G^t(x)\} = z^t(y).
\end{aligned}$$

This completes the proof. ∎

We are now ready to show the optimality result. 

Theorem 8.4.4 (Veinott, 1966) If (i) and (ii) hold, an $(s_t, S_t)$ policy is optimal
for the model (8.6). Moreover, $s \le s_t \le y^*$ and $y^* \le S_t \le S$.

Proof. The proof proceeds in several steps. We start with the assumption that
$G^t(y)$ is continuous in $y$. This assumption will be confirmed at the end.

(1) $S_t$ is a global minimizer of $G^t(y)$. For this purpose, we first show that $G^t(y)$
is nonincreasing for $y \le y^*$, which follows directly from (8.12). Because $G^t(y)$ is
continuous, there exists a number $S_t$ that minimizes $G^t(y)$ over $[y^*, S]$. Now it
is clear that $S_t$ minimizes $G^t(y)$ on $(-\infty, S]$. By the definition of $S$ and Lemma
8.4.2, it follows that for $y > S\ (\ge y^*)$,

$$\begin{aligned}
G^t(y) - G^t(y^*) &\ge G(y) - G(y^*) - K \\
&\ge G(S) - G(y^*) - K = 0,
\end{aligned}$$

where $G(y) \ge G(S)$ due to the quasiconvexity of $G(y)$. Hence, $S_t$ is indeed a global
minimizer of $G^t(y)$ and $y^* \le S_t \le S$.

(2) There exists a number $s_t$ such that

$$G^t(S_t) + K = G^t(s_t) \quad \text{and} \quad s \le s_t \le y^*.$$

The definitions of $S_t$, $s$ and $y^*$ imply that

$$\begin{aligned}
G^t(S_t) + K - G^t(s) &\le G^t(y^*) + K - G^t(s) \\
&\le G(y^*) + K - G(s) = 0,
\end{aligned}$$

where the first inequality follows from the definition that $S_t$ is the minimizer of
$G^t(y)$ while the second inequality holds due to Lemma 8.4.3. From the definition
of $y^*$ and Lemma 8.4.2, we see

$$G^t(S_t) + K - G^t(y^*) \ge G(S_t) - G(y^*) - K + K \ge 0.$$

Together with the continuity assumption of $G^t(y)$ and the fact that $G^t(y)$ is
nonincreasing on $(-\infty, y^*]$, the above two inequalities imply that there exists a number
$s_t$ such that

$$G^t(S_t) + K = G^t(s_t) \quad \text{and} \quad s \le s_t \le y^*.$$

(3) For $y^* \le y \le y'$,

$$[K + G^t(y')] - G^t(y) \ge 0.$$

This follows directly from Lemma 8.4.2 and the fact that $G(y') \ge G(y)$:

$$G^t(y') - G^t(y) \ge G(y') - G(y) - K \ge -K.$$

Note that this observation implies that placing an order does not reduce the
expected cost when $y \ge y^*$.

(4) We conclude, therefore, that an $(s_t, S_t)$ policy is optimal.

(5) It remains to prove that $G^t(y)$ is continuous in $y$.

Again, we proceed by induction. It is true for $t = T$ because $G^T(y) = G(y)$ (by
assumption (i)). Suppose now that $G^{t+1}(y)$ is continuous for $t < T$. From (4),

$$z^{t+1}(y) = \begin{cases} K + G^{t+1}(S_{t+1}), & \text{if } y \le s_{t+1}, \\ G^{t+1}(y), & \text{if } y > s_{t+1}. \end{cases}$$

Hence $z^{t+1}$ is continuous. Finally, the continuity of $E_D[z^{t+1}(y - D)]$, and hence of
$G^t(y)$, follows from the continuity of the function $z^{t+1}$ and the uniform continuity
theorem, which basically says that a continuous function is uniformly continuous
over a compact set. ∎

The above proof for the optimality of $(s_t, S_t)$ policies is based on the assumption
that demands are independent and identically distributed. If demands are not
independent and identically distributed, Lemma 8.4.3 will generally fail to hold
for the following reason. In the proof of Lemma 8.4.3, we require that
$z^{t+1}(y' - D) - z^{t+1}(y - D) \le 0$ for all $D$ in (8.14), which holds only if
$y - D \le y' - D \le y^*$. When demands are not independent and identically distributed,
the minimizer of $G(y)$ may vary from period to period, and the requirement that
$z^{t+1}(y' - D) - z^{t+1}(y - D) \le 0$ may not be met. In the proof based on $K$-convexity,
however, no requirement is imposed upon demands. Thus, while the result in this
section is more general than the results of Section 8.3 when demands are independent
and identically distributed, it is not a generalization of the first.



8.5 Infinite Horizon Models 

In this section we consider a discrete time infinite horizon model in which an order
may be placed by the warehouse at the beginning of any period. To simplify the
analysis, we focus on discrete inventory levels and assume a discrete distribution
of the one period demand $D$. Let $p_j = \Pr\{D = j\}$ for $j = 0, 1, 2, \ldots$. The objective
is to minimize the long-run expected cost per period. All other assumptions and
notation are identical to those in the previous section.

This problem has attracted considerable attention in the last three decades. The
intuition developed in the previous section (for the finite horizon models) suggests,
and it was proved by Iglehart (1963b) and Veinott and Wagner (1965), that an $(s, S)$
policy is optimal for the infinite horizon case. A simple proof is proposed by Zheng
(1991). Various algorithms have been suggested by Veinott and Wagner (1965),
Bell (1970) and Archibald and Silver (1978) as well as others; see, for instance,
Porteus (1990) or Zheng and Federgruen (1991). This section describes a simple
proof for the optimality of a stationary $(s, S)$ policy given by Zheng (1991) and
sketches an algorithm developed by Zheng and Federgruen (1991) for finding the
optimal $(s, S)$ policy. We follow those papers, as well as the insight provided in
Denardo (1996).

Let $c(s, S)$ be the long-run average cost associated with the $(s, S)$ policy. Given a
period and an initial inventory $y$, recall that the loss function $G(y)$ is the expected
holding and shortage cost minus revenue at the end of the period. In what follows
the loss function $G(y)$ is assumed to be quasiconvex and $G(y) \to \infty$ as $|y| \to \infty$.

Let $M(j)$ be the expected number of periods that elapse until the next order is
placed when starting with $s + j$ units of inventory. That is, $M(j)$ is the expected
number of periods until total demand is no less than $j$ units. It is obvious that for
all $j \ge 1$ we have

$$M(j) = \sum_{k=0}^{j} p_k [1 + M(j - k)] + \sum_{k=j+1}^{\infty} p_k = \sum_{k=0}^{\infty} p_k M(j - k) + 1, \qquad (8.15)$$

with $M(j) = 0$ for $j \le 0$.

Let $F(s, y)$ be the expected total cost in all periods until placing the next order,
when we start with $y$ units of inventory.

Observe that since orders are received immediately, each time an order is placed
the inventory level increases to $S$. Hence, replenishment times can be viewed as
regeneration points; see Ross (1970). The theory of regenerative processes tells us
that

$$c(s, S) = \frac{F(s, S)}{M(S - s)}. \qquad (8.16)$$

That is, $c(s, S)$, the long-run average cost, is the ratio of the expected cost between
successive regeneration points and the expected time between successive
regeneration points.

To calculate $M(S - s)$, one need only solve the recursive equation (8.15). In
addition,

$$F(s, S) = K + H(s, S),$$

where $H(s, y)$ is the expected holding and shortage cost until placing the next
order, when starting with $y$ units of inventory. How can we calculate the quantity
$H(s, S)$? For this purpose, observe that $M(j + 1) \ge M(j)$ and let

$$m(j) = M(j + 1) - M(j),$$

for $j = 0, 1, 2, \ldots$. To interpret $m(j)$, observe that for any $j$, $j < S - s$, $M(j + 1)$
is the expected time until demand exceeds $j$ units. Thus, the definition of $M(j)$
implies that $m(j)$ is the expected number of periods, prior to placing the next
order, for which the inventory level is exactly $S - j$. Hence,

$$H(s, S) = \sum_{j=0}^{S-s-1} m(j)G(S - j). \qquad (8.17)$$

An alternative way of computing $H(s, y)$ is as follows:

$$H(s, y) = G(y) + \sum_{j=0}^{\infty} p_j H(s, y - j), \quad \text{for } y > s, \qquad (8.18)$$

and $H(s, y) = 0$ for $y \le s$. To summarize, for an $(s, S)$ policy we have

$$c(s, S) = \frac{K + \sum_{j=0}^{S-s-1} m(j)G(S - j)}{M(S - s)}.$$
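The quantities above are straightforward to compute. The following Python sketch evaluates $M(j)$ from (8.15), $H(s, S)$ both from (8.17) and from the recursion (8.18), and the average cost $c(s, S)$ from (8.16). It assumes, beyond the text, that demand has finite support given as a list `p` with `p[k]` $= \Pr\{D = k\}$ and `p[0] < 1`, that $s < S$ are integers, and that the loss function is supplied as a Python callable `G`; all function names are hypothetical.

```python
def renewal_M(p, n):
    """M(0), ..., M(n) from recursion (8.15); p[k] = Pr{D = k} and p[0] < 1."""
    M = [0.0] * (n + 1)
    for j in range(1, n + 1):
        # Move the k = 0 term to the left-hand side and divide by 1 - p[0].
        tail = sum(p[k] * M[j - k] for k in range(1, min(j, len(p) - 1) + 1))
        M[j] = (1.0 + tail) / (1.0 - p[0])
    return M


def H_by_levels(s, S, G, p):
    """Formula (8.17): weight G(S - j) by m(j) = M(j + 1) - M(j)."""
    M = renewal_M(p, S - s)
    return sum((M[j + 1] - M[j]) * G(S - j) for j in range(S - s))


def H_by_recursion(s, S, G, p):
    """Recursion (8.18), solved upward from y = s + 1, with H(s, y) = 0 for y <= s."""
    H = {}
    for y in range(s + 1, S + 1):
        tail = sum(p[j] * H.get(y - j, 0.0) for j in range(1, len(p)))
        H[y] = (G(y) + tail) / (1.0 - p[0])
    return H[S]


def average_cost(s, S, K, G, p):
    """Long-run average cost c(s, S) of (8.16), for integers s < S."""
    return (K + H_by_levels(s, S, G, p)) / renewal_M(p, S - s)[S - s]
```

As a quick check, if demand is exactly one unit per period (`p = [0.0, 1.0]`), then $M(j) = j$ and $m(j) = 1$, so with $K = 4$ and $G(y) = |y|$ the policy $(0, 2)$ costs $(4 + 2 + 1)/2 = 3.5$ per period.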



Let $y^*$ be any minimizer of the loss function $G$. Zheng and Federgruen's algorithm
as well as Zheng's proof are essentially based on the following results, which
characterize the properties of the optimal $(s, S)$ policies.

Lemma 8.5.1 For any given $(s, S)$ policy, there exists another $(s', S')$ policy with
$s' < y^* \le S'$ such that $c(s', S') \le c(s, S)$.

Proof. Observe that $G(y)$ is a quasiconvex function of $y$ and therefore $G(y)$ is
nonincreasing for $y \le y^*$ and nondecreasing for $y \ge y^*$. Consider now $s \ge y^*$.
Equation (8.17) together with the quasiconvexity of $G(y)$ implies that
$H(s - 1, S - 1) \le H(s, S)$. Hence, $c(s, S) \ge c(s - 1, S - 1)$. Suppose now that $y^* > S$.
A similar argument shows that $H(s + 1, S + 1) \le H(s, S)$ and hence
$c(s, S) \ge c(s + 1, S + 1)$, which completes the proof. ∎

The following result is useful for our analysis. 

Lemma 8.5.2 Assume $s^0 < y^* \le S$. For a given $p$, we have that

(a) If $p \le G(s^0)$, then for any $s < s^0$, there exists $0 \le \beta \le 1$ such that

$$c(s, S) \ge \beta c(s^0, S) + (1 - \beta)p.$$

(b) If $p \ge G(s^0 + 1)$, then for any $s^0 < s < y^*$, there exists $0 \le \beta \le 1$ such that

$$c(s^0, S) \le \beta c(s, S) + (1 - \beta)p.$$







Proof. For part (a), let $\beta = M(S - s^0)/M(S - s)$ and observe that $0 \le \beta \le 1$.
From the definition of $c(s, S)$, we have

$$\begin{aligned}
c(s, S) &= \frac{K + \sum_{j=0}^{S-s^0-1} m(j)G(S - j) + \sum_{j=S-s^0}^{S-s-1} m(j)G(S - j)}{M(S - s)} \\
&= \frac{c(s^0, S)M(S - s^0) + \sum_{j=S-s^0}^{S-s-1} m(j)G(S - j)}{M(S - s)} \\
&\ge \frac{c(s^0, S)M(S - s^0) + \sum_{j=S-s^0}^{S-s-1} m(j)\,p}{M(S - s)} \\
&= \beta c(s^0, S) + (1 - \beta)p,
\end{aligned}$$

where the inequality holds since the loss function $G$ is quasiconvex, so that
$G(S - j) \ge G(s^0) \ge p$ for every inventory level $S - j \le s^0$. Finally, the
proof of part (b) follows from a similar argument and is left as an exercise. ∎

We are ready to provide a useful characterization of the optimal reorder levels
for a given order-up-to level.

Lemma 8.5.3 For a given order-up-to level $S$, a reorder level $s^0 < y^*$ is optimal
(i.e., $c(s^0, S) = \min_{s < S} c(s, S)$) if

$$G(s^0) \ge c(s^0, S) \ge G(s^0 + 1). \qquad (8.19)$$

Similarly, for any order-up-to level $S$, there exists an optimal reorder level $s^0$ such
that $s^0 < y^*$ and (8.19) holds.

Proof. The optimality of $s^0$ for $s^0$ satisfying (8.19) follows from Lemma 8.5.2 upon
letting $p = c(s^0, S)$.

We now prove the second part of the result. For any $s < y^*$, there exists an
$s^0 < y^*$ such that $G(s^0) \ge c(s, S) \ge G(s^0 + 1)$ since $G(y) \to \infty$ as $|y| \to \infty$
and $c(s, S) \ge \min_x G(x)$. Upon letting $p = c(s, S)$, Lemma 8.5.2 implies that
$c(s^0, S) \le c(s, S)$. If $s^0$ satisfies (8.19), then we are done; otherwise, there exists
$s^1 < y^*$ such that $s^1 > s^0$ and $G(s^1) \ge c(s^0, S) \ge G(s^1 + 1)$. Again from Lemma
8.5.2, we have $c(s^1, S) \le c(s^0, S)$. If $s^1$ satisfies (8.19), we are done; otherwise
repeat this process. This process has to be finite since $y^*$ is an upper bound and
thus we end up with a reorder point satisfying (8.19), which is optimal from the
first part of the result. ∎

An immediate byproduct of the lemma is an algorithm for finding an optimal
reorder point $s^0$ for any given $S$.

Corollary 8.5.4 For any value of $S$, $s^0 = \max\{y < y^* \mid c(y, S) \le G(y)\}$ is the
optimal reorder level associated with $S$.

Proof. Let

$$\alpha = \frac{M(S - s - 1)}{M(S - s)}$$

and observe that (8.16) and (8.17) imply that

$$c(s, S) = \alpha c(s + 1, S) + (1 - \alpha)G(s + 1). \qquad (8.20)$$

The definition of $s^0$ implies that

$$G(s^0) \ge c(s^0, S) \quad \text{and} \quad G(s^0 + 1) < c(s^0 + 1, S).$$

In addition, using (8.20), we have $c(s^0, S) \ge G(s^0 + 1)$. Hence, (8.19) holds and
by Lemma 8.5.3, $s^0$ is an optimal reorder point associated with $S$. ∎

Lemma 8.5.5 For two order-up-to levels $S^0, S \ge y^*$, let $s^0$ and $s$ with $s^0, s < y^*$
be the corresponding optimal reorder points, respectively. Moreover, assume that

$$G(s^0) \ge c(s^0, S^0) \ge G(s^0 + 1).$$

The $(s, S)$ policy improves on (has smaller cost than) $(s^0, S^0)$ if and only if

$$c(s^0, S) < c(s^0, S^0).$$

Proof. We need only show that if $c(s, S) < c(s^0, S^0)$, then $c(s^0, S) < c(s^0, S^0)$. By
contradiction, assume $c(s^0, S) \ge c(s^0, S^0)$. Upon letting
$p = c(s^0, S) \ge c(s^0, S^0) \ge G(s^0 + 1)$, we have from Lemma 8.5.2 part (b) that

$$c(s, S) \ge c(s^0, S) \ge c(s^0, S^0),$$

which is a contradiction. ∎

Finally, we provide a characterization of the optimal order-up-to level for a given
reorder level $s$. For this purpose, define

$$\phi(i, s, S) = \begin{cases} 0, & \text{if } i \le s, \\ G(i) - c(s, S) + \sum_{j=0}^{\infty} p_j \phi(i - j, s, S), & \text{otherwise.} \end{cases} \qquad (8.21)$$

From the recursive forms (8.15) and (8.18), we have that for $i > s$,

$$\phi(i, s, S) = H(s, i) - c(s, S)M(i - s),$$

and $\phi(S, s, S) = -K$.

Lemma 8.5.6 For a given reorder level $s$, if an order-up-to level $S^0$ is optimal
($c(s,S^0) = \inf_{S>s} c(s,S)$), then $c(s,S^0) > G(S^0)$.

Proof. Assume that $c(s,S) < G(S)$ for some $S > s$. Then there exists an inventory
level $i$ with $s < i < S$ such that $-K = \phi(S,s,S) > \phi(i,s,S)$. By (8.21), this implies that

$$c(s,S) > \frac{K + H(s,i)}{M(i-s)} = c(s,i).$$

Thus $S$ cannot be optimal. ∎

From the above proof, we can also see that if an order-up-to level $S$ is optimal
for a given reorder level $s$, then $\phi(i,s,S) \ge -K = \phi(S,s,S)$ for any $i$. The following
characterization of the properties of the best $(s,S)$ policy is an immediate
consequence of Lemma 8.5.1, Lemma 8.5.3, Lemma 8.5.6 and the above observation.

Lemma 8.5.7 There exists an $(s^*,S^*)$ policy such that the following holds.

(a) $c^* = c(s^*,S^*) = \inf_{s<S} c(s,S)$.

(b) $s^* < y^* \le S^*$.

(c) $G(s^*) > c^* > G(s^*+1)$.

(d) $G(S^*) < c^*$.

(e) $\phi(i) \ge -K = \phi(S^*)$ for any $i$, where $\phi(i) = \phi(i,s^*,S^*)$.

Furthermore, these results suggest the following simple algorithm. Start with
$S^0 = y^*$ and find the best reorder point $s^0$ by applying Corollary 8.5.4. Now increase $S$
by increments of 1, each time comparing $c(s^0,S^0)$ to $c(s^0,S)$. If $c(s^0,S) < c(s^0,S^0)$,
set $S^0 = S$ and find the corresponding reorder point. Continue until you have identified
$(s^0,S^0)$ such that no $S$, $S > S^0$, has $c(s^0,S) < c(s^0,S^0)$ and $G(S) > c(s^0,S^0)$.
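
The search just described can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation. It assumes integer demand with a finite probability vector `pmf` (with `pmf[0] < 1`), a loss function `G` that is quasi-convex and grows without bound, and it computes $c(s,S) = (K + H(s,S))/M(S-s)$ by solving the recursions for $H$ and $M$ level by level. The function names, the demand distribution and the cost parameters are hypothetical, and the loop stops at the first $S$ with $G(S) > c(s^0,S^0)$.

```python
def expected_loss(y, pmf, h_plus, h_minus):
    """G(y): expected one-period holding and shortage cost at level y."""
    return sum(p * (h_plus * max(0, y - d) + h_minus * max(0, d - y))
               for d, p in enumerate(pmf))


def cycle_cost(s, S, pmf, K, G):
    """c(s, S) = (K + H(s, S)) / M(S - s), with H and M built recursively."""
    H, M = {}, {}
    for i in range(s + 1, S + 1):
        h_sum, m_sum = G(i), 1.0
        for j in range(1, len(pmf)):
            if i - j > s:  # levels at or below s trigger an order: contribute 0
                h_sum += pmf[j] * H[i - j]
                m_sum += pmf[j] * M[i - j]
        # Move the j = 0 term (no demand) to the left-hand side.
        H[i] = h_sum / (1.0 - pmf[0])
        M[i] = m_sum / (1.0 - pmf[0])
    return (K + H[S]) / M[S]


def best_reorder_point(S, y_star, pmf, K, G):
    """Corollary 8.5.4: largest y < y* with c(y, S) < G(y)."""
    y = y_star - 1
    while cycle_cost(y, S, pmf, K, G) >= G(y):
        y -= 1
    return y


def find_policy(pmf, K, G, y_lo, y_hi):
    """Search over S starting from y*, the minimizer of G on [y_lo, y_hi]."""
    y_star = min(range(y_lo, y_hi + 1), key=G)
    S0 = y_star
    s0 = best_reorder_point(S0, y_star, pmf, K, G)
    c0 = cycle_cost(s0, S0, pmf, K, G)
    S = S0 + 1
    while G(S) <= c0:  # beyond this point no larger S can be optimal
        if cycle_cost(s0, S, pmf, K, G) < c0:  # test of Lemma 8.5.5
            S0 = S
            s0 = best_reorder_point(S0, y_star, pmf, K, G)
            c0 = cycle_cost(s0, S0, pmf, K, G)
        S += 1
    return s0, S0, c0


# Hypothetical data: demand is 0, 1 or 2 with the probabilities below.
pmf = [0.2, 0.5, 0.3]
G = lambda y: expected_loss(y, pmf, h_plus=1.0, h_minus=9.0)
s_opt, S_opt, c_opt = find_policy(pmf, K=5.0, G=G, y_lo=-10, y_hi=20)
```

Recomputing the reorder point from scratch after each improvement keeps the sketch short; an implementation concerned with speed would update $s^0$ incrementally instead.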

So far we have characterized the properties of the best $(s,S)$ policy, the $(s^*,S^*)$
policy, and described how to find such a policy. We are now ready to prove that
this stationary $(s^*,S^*)$ policy is optimal for the infinite horizon model. Of course,
as is common for the general infinite horizon dynamic program, one might attempt
to prove that there exists a function $h$ such that the following optimality equation
holds:

$$h(x) + c^* = \min_{y \ge x} \Bigl\{ K\delta(y-x) + G(y) + \sum_{j=0}^{\infty} p_j\, h(y-j) \Bigr\}. \tag{8.22}$$

In fact, one can prove that the function $\phi$ defined in Lemma 8.5.7 satisfies the above
optimality equation (8.22). Unfortunately, since this function is unbounded, there
is no result in dynamic programming which allows us to claim the optimality of the
stationary $(s^*,S^*)$ policy without further justification. Hence we follow a different
approach. In particular, we focus on a relaxed model where negative orders are
allowed and, whenever a negative order is placed, a fixed cost $K$ is charged.

We construct a bounded function $h$ satisfying the optimality equation for the
relaxed model

$$h(x) + c^* = \min_{y} \Bigl\{ K\delta(|y-x|) + G(y) + \sum_{j=0}^{\infty} p_j\, h(y-j) \Bigr\}. \tag{8.23}$$

The construction of function $h$ is as follows:

$$h(i) = \begin{cases} 0, & \text{if } i \le s^*, \\ \theta(i), & \text{for } s^* < i \le S^*, \\ \min\{0, \theta(i)\}, & \text{otherwise,} \end{cases} \tag{8.24}$$

where $\theta(i) = G(i) - c^* + \sum_{j=0}^{\infty} p_j\, h(i-j)$.

We now prove that $-K \le h(i) \le 0$ for any $i$. First notice that $\theta(i) = \phi(i)$
for $i \le S^*$ and hence from Lemma 8.5.7 part (e), we have that $h(i) \ge -K$ for
$i \le S^*$. Moreover, using Lemma 8.5.7 part (c), we can show that $h(i) \le 0$ for
any $s^* < i \le S^*$ and consequently $h(i) \le 0$ for any $i$. Thus it suffices to prove
$\theta(i) \ge -K$ for $i > S^*$. Assume to the contrary that there exists an $i'$ such
that $\theta(i') < -K$ and without loss of generality let $i'$ be the smallest one. Then
$h(i) \ge -K$ for any $i < i'$. In addition, there must exist an $i''$ such that $S^* < i'' < i'$
and $\theta(i'') > 0$; otherwise for any $i$ with $s^* < i < i'$, $\theta(i) \le 0$ and therefore
$h(i) = \theta(i) = \phi(i) \ge -K$ from Lemma 8.5.7 part (e). This implies that $G(i'') > c^*$.
However, since $G$ is quasi-convex and $i'' > S^* \ge y^*$, we can prove by induction
that $h(i) \ge -K$ for any $i$. This is a contradiction since $h(i') = \theta(i') < -K$. Hence
$-K \le h(i) \le 0$ for any $i$.

In summary, $-K \le h(i) \le 0$ for any $i$, $h(S^*) = -K$ and $\theta(i) \ge -K$ for any
$i > s^*$. It is straightforward to verify that $h(i)$ satisfies the optimality equation of
the relaxed model, (8.23), and a modified $(s^*,S^*)$ policy attains the minimization
in the optimality equation. In the modified policy, place an order to raise the
inventory level to $S^*$ whenever the initial inventory level is no more than $s^*$; do
not place any order when the initial inventory lies between $s^*+1$ and $S^*$; for an
inventory level above $S^*$, either place a negative order to reduce the inventory level to
$S^*$ or do nothing, depending on which choice is more cost effective.

We claim that the modified (s*, S*) policy is optimal for the relaxed model and 
its associated long-run average cost c* is optimal. Indeed, this claim follows from 
well known results for infinite horizon dynamic programming under average cost 
criterion since as we just proved, the function h is bounded; for details one may 
refer to any standard dynamic programming textbook, for instance, Theorem 2.1, 
p. 93 in Ross (1983). Also observe that the modified $(s^*,S^*)$ policy differs
from the $(s^*,S^*)$ policy in at most one period: when the initial inventory level
is too high, we may place a negative order to reduce the inventory level to $S^*$,
and after that the inventory level will never exceed $S^*$. Because the outcome of
a finite number of periods does not affect the long-run average cost, it is safe to
claim that the stationary $(s^*,S^*)$ policy is optimal for the relaxed model and its
associated cost $c^*$ is the optimal average cost. Finally notice that the stationary
$(s^*,S^*)$ policy is feasible for the original model, and the optimal average cost of
the original model is no less than the optimal average cost of the relaxed model.
Thus, this stationary $(s^*,S^*)$ policy is optimal for the original infinite horizon
model and its associated cost $c^*$ is the optimal long-run average cost.

We conclude this section with a discussion of the impact of leadtimes on the 
analysis. So far we have assumed zero leadtimes; if this fails to hold, and a fixed 
delivery leadtime has to be incorporated, the problem can be transformed into 
one with zero leadtime by a fairly simple change in the loss function $G(\cdot)$; see, for
instance, Veinott and Wagner (1965), Veinott (1966), Heyman and Sobel (1984) 
or the third exercise at the end of this chapter. For this purpose, let the inventory 
position at the warehouse be defined as the inventory at that warehouse plus 
inventory in transit to the warehouse. The loss function G(y) is calculated such 
that y is the inventory position and D is the total demand during the leadtime 
plus one period. 
Consider a distribution system with a single warehouse, denoted by the index 0,
and $n$ retailers, indexed from 1 to $n$. Incoming orders from an outside vendor with
unlimited stock are received by the warehouse that replenishes the retailers. We
refer to the warehouse or the retailers as facilities. The transportation leadtime to
facility $i = 0, 1, 2, \ldots, n$ is a constant $L_i$.

As in the previous section, we analyze a discrete time model in which customer
demands are independent and identically distributed and are faced only by the
retailers. Every time a facility places an order, it incurs a set-up cost $K_i$,
$i = 0, 1, 2, \ldots, n$. The echelon inventory holding cost (see Chapter 6) is $h_i^+$ at facility
$i$, $i = 0, 1, 2, \ldots, n$. Finally, demand is backlogged at a penalty cost of $h_i^-$,
$i = 1, 2, \ldots, n$, per unit per period. The objective is to find a centralized strategy,
that is, a strategy that uses systemwide inventory information, so as to minimize
long-run average system cost.

As the reader no doubt understands, the analysis of stochastic distribution mod- 
els is quite difficult and finding an optimal strategy is close to impossible; consider 
the difficulty involved in finding an approximate solution for its deterministic, 
constant demand counterpart; see Chapter 6. As a result, limited literature is 
available. The rare exceptions are the approximate strategy suggested by Eppen 
and Schrage (1981) and the lower bounds developed by Federgruen and Zipkin 
(1984a-c) and Chen and Zheng (1994). We briefly describe these two bounds here. 

For this purpose, let the echelon inventory position at a facility be defined as 
the echelon inventory at that facility plus inventory in transit to that facility. 

Consider the following approach suggested by Federgruen and Zipkin (1984a-c).
Given an inventory position $y_i$ at retailer $i$, let the loss function $G_i(y_i)$ be

$$G_i(y_i) = h_i^+ \max\{0, y_i - D\} + (h_i^- + h_i^+) \max\{0, D - y_i\},$$

where $D$ is total demand faced by retailer $i$ during $L_i + 1$ periods (see the end of
the previous section for a discussion).

Consider now any inventory policy with echelon inventory of $y$ units at the
warehouse and inventory position $y_i$ at retailer $i$. The expected one period holding
and shortage cost in the system is

$$G(y) = h_0^+ (y - \mu) + \sum_{i=1}^{n} G_i(y_i),$$

where $\mu$ is the expected single period systemwide demand. Since, by definition,
$y \ge \sum_{i=1}^{n} y_i$, a lower bound on $G(y)$ is obtained by finding

$$G_0(y) = \min_{y_1,\ldots,y_n} \Bigl\{ h_0^+ (y - \mu) + \sum_{i=1}^{n} G_i(y_i) \;\Big|\; \sum_{i=1}^{n} y_i \le y \Bigr\}. \tag{8.25}$$

Thus, a lower bound on the long-run average system cost, $C^{FZ}$, is obtained by
solving a single facility inventory problem with loss function $G_0$ and set-up cost
$K_0$. Notice that this bound does not take into account the retailer-specific set-up
costs. These are incorporated in the next lower bound, due to Chen and Zheng (1994).
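
To make the induced loss function in (8.25) concrete, the following Python sketch evaluates it by brute-force enumeration over integer retailer positions. It is meant only as an illustration for very small instances, not as the method used by Federgruen and Zipkin. The function names, the search grid and the toy data (two retailers, zero leadtimes, deterministic demand of 2 units per retailer) are all assumptions made for the example.

```python
from itertools import product


def retailer_loss(y, pmf, h_plus, h_minus):
    """G_i(y): expected loss at retailer i for inventory position y."""
    return sum(p * (h_plus * max(0, y - d) + (h_minus + h_plus) * max(0, d - y))
               for d, p in enumerate(pmf))


def induced_loss(y, retailers, h0_plus, mu, grid):
    """G_0(y) of (8.25): best split of at most y units among the retailers.

    retailers is a list of (pmf, h_plus, h_minus) tuples; grid is the set of
    integer positions tried for every retailer (an assumption of the sketch).
    """
    best = min(
        sum(retailer_loss(yi, *r) for yi, r in zip(alloc, retailers))
        for alloc in product(grid, repeat=len(retailers))
        if sum(alloc) <= y
    )
    return h0_plus * (y - mu) + best


# Toy instance: each retailer faces a demand of exactly 2 units.
retailers = [([0.0, 0.0, 1.0], 1.0, 4.0), ([0.0, 0.0, 1.0], 1.0, 4.0)]
grid = range(-5, 11)
g0_enough = induced_loss(6, retailers, h0_plus=0.5, mu=4.0, grid=grid)
g0_short = induced_loss(3, retailers, h0_plus=0.5, mu=4.0, grid=grid)
```

With 6 units in the system both retailers can be held at their demand of 2, so only the warehouse term remains; with 3 units one unit of demand must be backlogged somewhere, which is what drives the second value up.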

To describe their lower bound, consider the following assembly-distribution system
associated with the original distribution system. In the assembly-distribution
system each retailer sells a product consisting of two components: a basic component,
denoted by $a_0$, and a retailer-specific component, denoted by $a_i$. Each retailer
receives component $a_0$ from the warehouse, which receives it from the outside supplier.
On the other hand, component $a_i$ is supplied directly from the vendor to
retailer $i$. The arrival of a basic component at retailer $i$ is coordinated with the arrival
of component $a_i$. That is, at the time the warehouse delivers basic components
to retailer $i$, the same number of $a_i$ components are shipped to the retailer from
the supplier. These two shipments arrive at the same time and the final product
is assembled, each containing one basic component and one component $a_i$.

To ensure that the original distribution system and the assembly-distribution 
system are, in some sense, equivalent, we allocate cost in the new system as follows. 
Associated with retailer i is a single facility inventory model with set-up cost Ki, 
holding cost hf and shortage cost Kq + . Delivery leadtime to the facility is 

Li and demand is distributed according to demand faced by retailer i. This is, 
of course, a standard inventory model for which an $(s_i,S_i)$ policy is optimal. Let
$C_i$ be the long-run average cost associated with this optimal policy. Given an
inventory position $y$, let $G_i(y)$ be the associated loss function. Finally, let



$$G_i^i(y) = \begin{cases} C_i, & \text{if } y \le s_i, \\ G_i(y), & \text{if } y > s_i, \end{cases}$$

and $G_i^0(y) = G_i(y) - G_i^i(y)$.

In the assembly-distribution system costs are charged as follows. A set-up cost
$K_0$ is allocated to the basic component and a set-up cost $K_i$ to each component $a_i$,
and an expected holding and penalty cost, that is, loss function, of $G_i^0$ to the basic
component and $G_i^i$ to component $a_i$. Notice that since shipments are coordinated,
there is no difference between long-run average cost in the original system and in
the assembly-distribution system.

To find a lower bound on the long-run average cost of the original system,
we consider a relaxation of the assembly-distribution system in which the basic
components can be sold independently of the other components. Thus, $C_i$,
$i = 1, 2, \ldots, n$, is exactly the long-run average cost associated with the distribution of
component $a_i$. Let $C_0$ be a lower bound on the long-run average cost of the basic
component. Consequently, $\sum_{i=0}^{n} C_i$ is a lower bound on the long-run average cost
of the original distribution system.

It remains to find $C_0$. This is obtained following the approach suggested by
Federgruen and Zipkin and described above. For this purpose, we replace $G_i$ by
$G_i^0$ in (8.25) and take $C^{FZ}$ as $C_0$.




8.7 Exercises 






Exercise 8.1. In (8.1), we assume that $F(D)$ is continuous. Now suppose that
$F(D)$ is not necessarily continuous. Does there exist an $S$ such that $z(y)$ is minimized
at $y = S$? If there exists such an $S$, how can you determine it?

Exercise 8.2. Prove (8.20). 

Exercise 8.3. Consider the single warehouse inventory model analyzed in Section
8.5 with leadtime $l > 0$. Prove that the inventory on hand at the end of period $t$
for some $t > l$ can be written as

$$S_{t-l} - \sum_{i=t-l}^{t} D_i,$$

where $S_{t-l}$ is the order-up-to level in period $t - l$ and $D_i$ is the demand in period $i$.
Conclude that any nonzero leadtime model can be replaced by a model with zero
leadtime for which the loss function $G(y)$ is calculated according to (8.2) with $y$
being the inventory position and $D$ the total demand during the leadtime.

Exercise 8.4. It is now June and your company has to make a decision regarding
how many ski jackets to produce for the coming winter season. It costs $c$ dollars to
produce one ski jacket, which can be sold for $r$ dollars. Ski jackets not sold during
the winter season are lost. Suppose your marketing department estimates that
demand during the season can take one of the values $D_1, D_2, \ldots, D_k$, $k \ge 3$. Since
this is a new product, they do not know what probabilities to attach to each
possible demand; that is, they do not have estimates of $p_i$, the probability that
demand during the winter season will be $D_i$, $i = 1, 2, \ldots, k$. They have, however,
a good estimate of average demand $\mu$ and the variance of the demand $\sigma^2$. Your
objective is to find a production quantity $y$ that will protect you against the worst
probability distribution possible while maximizing profit. For this purpose you
would like to consider the following optimization model:

$$\max_{y}\ \min_{(p_1,\ldots,p_k) \in \mathcal{P}}\ \text{Average Profit}, \tag{8.26}$$

where $\mathcal{P}$ is the set of all possible discrete distribution functions with mean $\mu$ and
variance $\sigma^2$.

(a) Write an expression for the average profit as a function of the production
quantity $y$ and the unknown probabilities $p_1, p_2, \ldots, p_k$.

(b) Suppose we have already determined the production quantity, $y$. Write a
linear program that identifies the worst possible distribution, that is, the
one that minimizes average profit.





(c) Given a value of $y$, characterize the worst possible distribution; that is, identify
the number of demand points that have positive probabilities in the
probability distribution found in the previous question.

(d) Can you formulate a linear program that finds the optimal production quantity;
that is, can you write a linear program that solves equation (8.26)?



Exercise 8.5. Consider the following discrete version of the newsboy problem.
Demand for a product can take the values $D_1, D_2, \ldots, D_n$, $n \ge 3$, with probabilities
$p_1, p_2, \ldots, p_n$, where $\sum_{i=1}^{n} p_i = 1$. Let $r$ be a known selling price per unit and
$c$ be a known cost per unit. Our objective is to find an order quantity $y$ that
maximizes expected profit. Prove that the optimal order quantity that maximizes
the expected profit must be one of the demand points, $D_1, D_2, \ldots, D_n$.

Exercise 8.6. Prove Lemma 8.3.2 parts (a), (b) and (c). 

Exercise 8.7. Consider the newsboy problem with demand $D$ being a random
variable whose density, $f(D)$, is known. Let $r$ be a known selling price per unit
and $c$ be a known cost per unit. Assume no initial inventory and no salvage value.
The objective is to find an order quantity $y$ that maximizes expected profit.

(a) Let a service level be defined as the probability that demand is no more
than the order quantity, $y$. Our objective is to find the order quantity, $y$,
that maximizes expected profit subject to the requirement that the service
level is at least $\alpha$. What is the optimal order quantity as a function of $\alpha$, $c$,
$r$ and $f(D)$?

(b) Suppose there is no service level requirement; however, there is a capacity
constraint, $C$, on the amount we can order. That is, the order quantity,
$y$, cannot be more than $C$. What is the optimal order quantity, $y$, that
maximizes expected profit subject to the capacity constraint, $C$?

(c) Suppose there is a service level requirement, $\alpha$, and a capacity constraint,
$C$. What is the optimal order quantity, $y$, that maximizes expected profit
subject to the constraints that the service level is at least $\alpha$ and the capacity
constraint, $C$?



Exercise 8.8. Prove that a real-valued function $f$ is $K$-convex if and only if for
any $z \ge 0$, $b > 0$ and any $y$, we have

$$K + f(y+z) \ge f(y) + \frac{z}{b}\bigl(f(y) - f(y-b)\bigr).$$



Exercise 8.9. Prove Lemma 8.5.2 part (b). 
9.1 Introduction

In Chapter 8 we analyzed the traditional stochastic inventory models. Those models
focus on effective replenishment strategies and typically assume that a commodity's
price is exogenously determined. In recent years, however, a number of
industries have used innovative pricing strategies to manage their inventory effectively.
For example, techniques such as revenue management have been applied
in the airline, hotel, and car rental industries, integrating price, inventory control,
and quality of service; see Kimes (1989). In the retail industry, to name another
example, dynamically pricing commodities can provide significant improvements
in profitability, as shown by Gallego and van Ryzin (1994).

These developments call for models that integrate inventory control and pricing 
strategies. Such models are clearly important not only in the retail industry, where 
price-dependent demand plays an important role, but also in manufacturing en- 
vironments in which production/distribution decisions can be complemented with 
pricing strategies to improve the firm’s bottom line. 

The coordination of replenishment strategies and pricing policies has been the
focus of many papers, starting with the work of Whitin (1955), who analyzed the
celebrated newsvendor problem with price dependent demand. For a review, the
reader is referred to Eliashberg and Steinberg (1991), Petruzzi and Dada (1999),
Federgruen and Heching (1999), Yano and Gilbert (2002), Elmaghraby and Keskinocak
(2003) or Chan, Shen, Simchi-Levi and Swann (2003). The single period
models analyzed in Section 9.2 seem to be new; similar results under different
assumptions appear in Agrawal and Seshadri (2000). The development of our finite
horizon models in Section 9.3 is essentially based on Chen and Simchi-Levi
(2002a). In Section 9.4, we focus on risk averse inventory (and pricing) models
proposed by Chen, Sim, Simchi-Levi and Sun (2004).



9.2 Single Period Models 

We start by analyzing the single period problem in which a risk-neutral retailer
has to decide on its stock level and the selling price of a single product. Contrary
to the newsvendor model discussed in Chapter 8, demand depends on the selling
price and hence is endogenously determined. In particular, for a given selling price
$p$, the demand has the following form.

Assumption 9.2.1 The demand function satisfies

$$D(p,\epsilon) = \alpha D(p) + \beta, \tag{9.1}$$

where $\epsilon = (\alpha,\beta)$, and $\alpha$ is a nonnegative random variable with $E[\alpha] = 1$ and
$E[\beta] = 0$. Furthermore, $D(p)$ is continuous and strictly decreasing, and the expected
revenue

$$R(d) = d\,D^{-1}(d)$$

is a concave function of the expected demand $d$.

An implicit assumption here is that the realized demand $D(p,\epsilon)$ is always nonnegative,
which imposes some conditions on the selling price and the two random
variables $\alpha$ and $\beta$. Observe that, by scaling and shifting, the assumptions $E[\alpha] = 1$
and $E[\beta] = 0$ can be made without loss of generality. A special case of this demand
function is the additive demand function. In this case, the demand function is of
the form $D(p,\epsilon) = D(p) + \beta$. Another special case of the demand function (9.1) is
a model with multiplicative demand. In this case, the demand function is of the
form $D(p,\epsilon) = \alpha D(p)$, where $\alpha$ is a random variable. Finally observe that special
cases of the function $D(p)$ include $D(p) = b - ap$ ($a > 0$, $b > 0$) in the additive case
and $D(p) = a p^{-b}$ ($a > 0$, $b > 1$) in the multiplicative case; both are commonly
used in the economics literature.

In the single period model, an ordering and pricing decision is made before the
realization of the demand. The unit ordering cost is $c$ and unsatisfied demand is
filled with an emergency order. Let $h(x)$ be the inventory holding/disposal cost or
the emergency ordering cost when the inventory level after satisfying the demand
is $x$. A common form of $h(x)$ is as follows:

$$h(x) = h^+ \max\{0, x\} + h^- \max\{0, -x\}, \tag{9.2}$$

where $h^+$ is the unit inventory holding/disposal cost if $h^+$ is nonnegative or the
unit salvage value if it is negative, and $h^-$ is the unit cost for the emergency order.
We assume that $h(x)$ is convex and 0 is a minimizer of the function $cx + h(x)$. For
$h(x)$ having the particular form (9.2), the above assumptions imply that

$$h^- \ge c \ge \max\{0, -h^+\},$$

that is, the salvage value is no more than the normal unit ordering cost, which in
turn is no more than the unit cost for the emergency order.

For a given stock level $y$ and a selling price $p$, the expected profit of the retailer
is calculated as follows:

$$v(y,p) = E[p D(p,\epsilon)] - cy - E[h(y - D(p,\epsilon))].$$

Assumption 9.2.1 implies that there is a one-to-one correspondence between the
selling price $p$ and the expected demand $d$. Thus we have an equivalent representation
for the retailer's expected profit:

$$\phi(y,d) = R(d) - cy - E[h(y - \alpha d - \beta)].$$

The objective of the retailer is to find a stock level and a selling price, correspondingly
an associated expected demand, so as to maximize the retailer's expected
profit, namely

$$\max_{y \ge 0,\ d \in [\underline{d},\, \bar{d}]} \phi(y,d), \tag{9.3}$$

where $\underline{d}$ and $\bar{d}$ are the lower and upper bounds of the expected demand corresponding
to the upper and lower bounds of the selling price. Notice that $\phi(y,d)$ is jointly
concave in $y$ and $d$ and hence the above optimization can be solved efficiently.

Our intention here is to compare the selling prices under deterministic and 
stochastic demands. In particular, we show that there is a significant difference 
between the additive demand case and the multiplicative demand case. Before we 
proceed to our main result of this section, we need the following lemma. 

Lemma 9.2.2 Let $f$ be a convex function over $\mathbb{R}$. Then for any $x, d, \eta \ge 0$,

$$E[f(x - \alpha d)] \le E[f(x + \eta - \alpha(d + \eta))],$$

where $\alpha$ is a nonnegative random variable with $E[\alpha] = 1$.

Proof. Notice that a convex function has nondecreasing differences. Hence we have
that for any $x, d, \eta, \alpha \ge 0$,

$$f(x - \alpha d) - f(x - \alpha d - (\alpha - 1)\eta) \le f(x) - f(x - (\alpha - 1)\eta).$$

Taking expectations on both sides of the above inequality and using Jensen's
inequality gives the result. ∎

Now we are ready to present one of our main results of this section. 

Theorem 9.2.3 The optimal selling price for the additive demand case equals the 
optimal selling price for the deterministic demand case, which, on the other hand, 
is no more than the optimal selling price for the multiplicative demand case. 







Proof. It suffices to prove that there exist $(y_d^*, d_d^*)$, $(y_a^*, d_a^*)$, $(y_m^*, d_m^*)$ such that
$d_a^* = d_d^* \ge d_m^*$, where $(y_d^*, d_d^*)$, $(y_a^*, d_a^*)$, $(y_m^*, d_m^*)$ are optimal solutions for problem
(9.3) when demand is deterministic, additive and multiplicative, respectively.

First notice that

$$\phi(y,d) = R(d) - cd - E[c(y - \alpha d - \beta) + h(y - \alpha d - \beta)].$$

For the deterministic demand case with $\alpha = 1$ and $\beta = 0$, since 0 is a minimizer
of the function $cx + h(x)$, it is optimal to set a selling price such that the realized
demand is $d_d^*$, which solves

$$\max_{d \in [\underline{d},\, \bar{d}]} R(d) - cd,$$

and to order exactly the demand, that is, $y_d^* = d_d^*$.

Now we prove that there exists an optimal solution $(y_a^*, d_a^*)$ for problem (9.3)
with additive demand such that $d_a^* = d_d^*$. If $d_a^* < d_d^*$, then $(y_a^* + \eta, d_a^* + \eta)$ gives an
objective value no less than that given by $(y_a^*, d_a^*)$ for a sufficiently small positive
$\eta$. If $d_a^* > d_d^*$, we distinguish between two cases. First, $y_a^* > 0$. In this case,
$(y_a^* - \eta, d_a^* - \eta)$ gives an objective value no less than that given by $(y_a^*, d_a^*)$ for
a sufficiently small positive $\eta$. Second, $y_a^* = 0$. In this case, $(0, d_a^* - \eta)$ gives an
objective value no less than that given by $(y_a^*, d_a^*)$ for a sufficiently small positive $\eta$,
since 0 is a minimizer of the function $cx + h(x)$. Therefore, there exists an optimal
solution $(y_a^*, d_a^*)$ for problem (9.3) with additive demand such that $d_a^* = d_d^*$.

Finally, we argue that there exists an optimal solution $(y_m^*, d_m^*)$ for problem
(9.3) with multiplicative demand such that $d_m^* \le d_d^*$. Assume that $d_m^* > d_d^*$. Again
we distinguish between two cases. First, $y_m^* > 0$. In this case, Lemma 9.2.2 implies
that $(y_m^* - \eta, d_m^* - \eta)$ gives an objective value no less than that given by $(y_m^*, d_m^*)$
for a sufficiently small positive $\eta$. Second, $y_m^* = 0$. Similarly to the argument for
the additive demand case, $(0, d_m^* - \eta)$ gives an objective value no less than that
given by $(y_m^*, d_m^*)$ for a sufficiently small positive $\eta$, since 0 is a minimizer of the
function $cx + h(x)$. Therefore, there exists an optimal solution $(y_m^*, d_m^*)$ for problem
(9.3) with multiplicative demand such that $d_m^* \le d_d^*$. ∎

The above theorem thus implies that there is a significant difference between 
the additive demand case and the multiplicative demand case. To understand 
this difference, notice that the variance of the additive demand is independent of 
the selling price while the variance of the multiplicative demand is a decreasing 
function of the selling price. Thus for the multiplicative demand case, the retailer 
tends to choose a higher selling price so as to decrease the variability of the demand. 
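
The comparison in Theorem 9.2.3 can be checked numerically on a small instance. The sketch below is only an illustration under assumed data: linear expected demand $D(p) = b - ap$ with $a = 2$, $b = 20$, unit cost $c = 2$, the cost function (9.2) with $h^+ = 1$ and $h^- = 6$, and two-point distributions for $\alpha$ and $\beta$. It maximizes $\phi(y,d)$ by plain grid search; the function names and grids are hypothetical.

```python
def h(x, h_plus=1.0, h_minus=6.0):
    """Cost function (9.2): holding cost if x > 0, emergency order if x < 0."""
    return h_plus * max(0.0, x) + h_minus * max(0.0, -x)


def best_price(scenarios, a=2.0, b=20.0, c=2.0):
    """Grid search for problem (9.3) with expected demand D(p) = b - a p.

    scenarios is a list of (alpha, beta, probability) triples describing the
    random perturbation; returns (price, stock level, expected profit).
    """
    d_grid = [2.0 + 0.5 * k for k in range(16)]   # expected demand 2.0 .. 9.5
    y_grid = [0.25 * k for k in range(81)]        # stock level 0 .. 20
    best = None
    for d in d_grid:
        revenue = d * (b - d) / a                 # R(d) = d * D^{-1}(d)
        for y in y_grid:
            exp_h = sum(pr * h(y - al * d - be) for al, be, pr in scenarios)
            profit = revenue - c * y - exp_h
            if best is None or profit > best[0]:
                best = (profit, y, d)
    profit, y, d = best
    return (b - d) / a, y, profit


deterministic = [(1.0, 0.0, 1.0)]
additive = [(1.0, -2.0, 0.5), (1.0, 2.0, 0.5)]        # E[beta] = 0
multiplicative = [(0.5, 0.0, 0.5), (1.5, 0.0, 0.5)]   # E[alpha] = 1

p_det = best_price(deterministic)[0]
p_add = best_price(additive)[0]
p_mult = best_price(multiplicative)[0]
```

In this instance the additive and deterministic prices coincide, while the multiplicative price is higher, in line with the theorem. The reason is visible in the multiplicative case: the expected overage and underage cost scales with $d$, so it acts like a higher unit cost.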

In the above discussion, we assume zero initial inventory level and zero fixed
ordering cost. Now let $x$ be the initial inventory level, $y$ be the target stock level
and also assume that the fixed ordering cost is $K$. In this case, we face the following
problem:

$$\max_{y \ge x,\ d \in [\underline{d},\, \bar{d}]} -K\delta(y - x) + \phi(y,d) + cx, \tag{9.4}$$

where $\delta(u) = 1$ for $u > 0$ and $\delta(0) = 0$.

In the following, we will show that a simple policy, referred to as an $(s,S,p)$ policy,
is optimal for problem (9.4). In such a policy, the inventory is managed based on
an $(s,S)$ policy and the optimal price $p(x)$ is a function of the initial inventory
level $x$. Moreover, for the special case with zero fixed ordering cost, a base stock
list price policy is optimal: the inventory is managed based on a base stock policy
and the optimal price is a non-increasing function of the initial inventory level.

Theorem 9.2.4 For problem (9.4), an $(s,S,p)$ policy is optimal. Furthermore,
for the special case with zero fixed ordering cost, a base stock list price policy is
optimal.

Proof. First notice that Theorem 2.3.6 implies that the function $\phi(x,d)$ is supermodular.
Thus, from Theorem 2.3.7, there exists a nondecreasing function $d(x)$
such that $d(x)$ maximizes $\phi(x,d)$ for any given inventory level $x$.

Now let $S$ be a maximizer of the function $\phi(x, d(x)) + cx$ and $s$ satisfy

$$\phi(s, d(s)) + cs = \phi(S, d(S)) + cS - K.$$

Since $\phi(x,d)$ is jointly concave in $(x,d)$, $\phi(x, d(x))$ is a concave function. This
allows one to show that the optimal inventory level is managed based on the $(s,S)$
policy. Moreover, the optimal price is a function of the initial inventory level: if
$x$ is no more than $s$, the optimal price is $D^{-1}(d(S))$; if $x$ is greater than $s$, the
optimal price is $D^{-1}(d(x))$.

Finally, for the special case with zero fixed ordering cost, we have $s = S$ and the
optimal selling price is $D^{-1}(d(\max(S,x)))$. Hence a base stock list price policy is
optimal. ∎
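
The decision rule described in the proof is simple to state in code. The sketch below takes the quantities $s$, $S$, $d(\cdot)$ and $D^{-1}(\cdot)$ as given; it does not compute them. The particular numbers, the form of `expected_demand` and the linear inverse demand are hypothetical placeholders chosen only so that the example runs.

```python
def ssp_policy(x, s, S, expected_demand, inverse_demand):
    """Return (stock level after ordering, selling price) for inventory x.

    Order up to S when x <= s, otherwise do not order; the price is the one
    whose expected demand is d(.) evaluated at the resulting stock level.
    """
    target = S if x <= s else x
    return target, inverse_demand(expected_demand(target))


# Hypothetical inputs: a nondecreasing d(x) and D(p) = 20 - 2p.
def expected_demand(x):
    return min(9.5, max(2.0, 4.0 + 0.5 * x))


def inverse_demand(d):
    return (20.0 - d) / 2.0


low_stock = ssp_policy(1, s=2, S=10,
                       expected_demand=expected_demand,
                       inverse_demand=inverse_demand)
high_stock = ssp_policy(4, s=2, S=10,
                        expected_demand=expected_demand,
                        inverse_demand=inverse_demand)
```

Because $d(x)$ is nondecreasing and $D^{-1}$ is decreasing, the price returned for $x > s$ falls as the initial inventory rises.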



9.3 Finite Horizon Models 

9.3.1 Model Description 

In this section, we focus on a finite horizon model. Unlike in the single period models,
the structure of the optimal policy differs significantly between the additive
demand case and the multiplicative demand case, as we will demonstrate in this section.

Consider a firm that has to make replenishment and pricing decisions over a
finite time horizon with $T$ periods.

Demands in different periods are independent of each other. For each period $t$,
$t = 1, 2, \ldots, T$, let $d_t$ be the demand and $p_t$ be the selling price in period $t$. We
assume that $d_t = \alpha_t D_t(p_t) + \beta_t$, which is time dependent and satisfies Assumption
9.2.1. Notice that in this section, the random perturbation $\epsilon$, the demand function
$D(p,\epsilon)$ and the expected revenue function $R(d)$ are indexed by $t$ to denote time
dependence. The selling price $p_t$ is restricted to an interval. In particular, let $\underline{p}_t$
and $\bar{p}_t$ be the lower and upper bounds of the selling price $p_t$, respectively.

Let $x_t$ be the inventory level at the beginning of period $t$, just before placing an
order. Similarly, $y_t$ is the inventory level at the beginning of period $t$ after placing
an order. The ordering cost function includes both a fixed cost and a variable cost
and is calculated for every $t$, $t = 1, 2, \ldots, T$, as

$$K\delta(y_t - x_t) + c_t(y_t - x_t).$$

Lead time is assumed to be zero and hence an order placed at the beginning of
period $t$ arrives immediately before demand for the period is realized.

Unsatisfied demand is backlogged. Let $x$ be the inventory level carried over
from period $t$ to the next period. Since we allow backlogging, $x$ may be positive or
negative. A cost $h_t(x)$ is incurred at the end of period $t$, which represents inventory
holding cost when $x > 0$ and shortage cost if $x < 0$.

Given a discount factor $\gamma$ with $0 < \gamma \le 1$, an initial inventory level, $x_1 = x$, and
a pricing and replenishment policy, let

$$V_T^{\gamma}(x) = \sum_{t=1}^{T} \gamma^{t-1} \bigl( -K\delta(y_t - x_t) - c_t(y_t - x_t) - h_t(x_{t+1}) + p_t D_t(p_t,\epsilon_t) \bigr) \tag{9.5}$$

be the $T$-period total discounted profit for a realization of the random perturbations
$\epsilon_t$, where $x_{t+1} = y_t - D_t(p_t,\epsilon_t)$.

The objective is to decide on ordering and pricing policies so as to maximize total
expected discounted profit over the entire planning horizon, that is, the objective
is to maximize

$$E[V_T^{\gamma}(x)] \tag{9.6}$$

for any initial inventory level $x$ and any $0 < \gamma \le 1$.

To find the optimal strategy that maximizes (9.6), let $v_t(x)$ be the maximum
total expected discounted profit when $T - t$ periods remain in the planning horizon
and the inventory level at the beginning of period $t$ is $x$. A natural dynamic
program that can be applied to find the policy maximizing (9.6) is as follows. For
$t = 1, 2, \ldots, T$,

$$v_t(x) = c_t x + \max_{y \ge x,\ \bar{p}_t \ge p \ge \underline{p}_t} -K\delta(y - x) + f_t(y,p) \tag{9.7}$$

with $v_{T+1}(x) = 0$ for any $x$, where

$$f_t(y,p) := -c_t y + E\bigl[ p D_t(p,\epsilon_t) - h_t(y - D_t(p,\epsilon_t)) + \gamma v_{t+1}(y - D_t(p,\epsilon_t)) \bigr].$$

Observe that the single period profit function

$$-c_t(y - x) + E[p D_t(p, \epsilon_t) - h_t(y - D_t(p, \epsilon_t))]$$

is not necessarily a concave function of the selling price $p$, since $D_t(p, \epsilon_t)$ may be a nonlinear function of $p$. Fortunately, for the general demand functions (9.1), we can represent the formulation (9.7) only with respect to expected demand rather than with respect to price, which allows us to show that the single period profit function is jointly concave in terms of the inventory level and expected demand. Note that there is a one-to-one correspondence between the selling price $p_t \in [\underline{p}_t, \bar{p}_t]$ and the expected demand $D_t(p_t) \in [\underline{d}_t, \bar{d}_t]$, where

$$\underline{d}_t = D_t(\bar{p}_t) \quad \text{and} \quad \bar{d}_t = D_t(\underline{p}_t).$$

We denote the expected demand at period $t$ by $d = D_t(p)$. Also let

$$\phi_t(x) = v_t(x) - c_t x, \quad h_t^\gamma(y) = h_t(y) + (c_t - \gamma c_{t+1}) y, \quad \text{and} \quad \hat{R}_t(d) = R_t(d) - c_t d,$$

where $c_{T+1} = 0$ and $R_t$ is the expected revenue function with

$$R_t(d) = d\, D_t^{-1}(d),$$

which by Assumption 9.2.1 is a concave function of expected demand $d$. These functions, $\phi_t(x)$, $h_t^\gamma(y)$ and $\hat{R}_t(d)$, allow us to transform the original problem to a problem with zero variable ordering cost.

Specifically, the dynamic program (9.7) can be written as

$$\phi_t(x) = \max_{y \ge x} -K\delta(y - x) + g_t(y, d_t(y)) \qquad (9.8)$$

with $\phi_{T+1}(x) = 0$ for any $x$, where

$$g_t(y, d) = H_t^\gamma(y, d) + \gamma E[\phi_{t+1}(y - \alpha_t d - \beta_t)], \qquad (9.9)$$

$$H_t^\gamma(y, d) := -E[h_t^\gamma(y - \alpha_t d - \beta_t)] + \hat{R}_t(d),$$

and

$$d_t(y) \in \arg\max_{\bar{d}_t \ge d \ge \underline{d}_t} g_t(y, d). \qquad (9.10)$$

Thus, most of our focus is on the transformed problem (9.8), which has a similar structure to problem (9.7). In this transformed problem one can think of $h_t^\gamma$ as being the holding and shortage cost function, $\hat{R}_t$ as being the revenue function, and the variable ordering cost as being equal to zero.
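The step from (9.7) to (9.8) can be checked term by term. The derivation below is a sketch that assumes the normalization $E[\alpha_t] = 1$ and $E[\beta_t] = 0$ for the demand model (9.1), so that $E[p D_t(p, \epsilon_t)] = p\,d = R_t(d)$; the shorthand $z$ is introduced here only for this calculation.

```latex
\begin{align*}
\text{Let } z &:= y - \alpha_t d - \beta_t, \qquad E[z] = y - d, \qquad
  v_{t+1}(z) = \phi_{t+1}(z) + c_{t+1} z. \\
\phi_t(x) &= v_t(x) - c_t x \\
  &= \max_{y \ge x,\, d}\; -K\delta(y-x) - c_t y + R_t(d) - E[h_t(z)]
     + \gamma E[\phi_{t+1}(z)] + \gamma c_{t+1}(y - d). \\
-c_t y + \gamma c_{t+1}(y - d)
  &= -(c_t - \gamma c_{t+1})(y - d) - c_t d
   = -(c_t - \gamma c_{t+1})E[z] - c_t d. \\
\phi_t(x) &= \max_{y \ge x,\, d}\; -K\delta(y-x)
     + \underbrace{\bigl(R_t(d) - c_t d\bigr)}_{\hat{R}_t(d)}
     - E\bigl[\underbrace{h_t(z) + (c_t - \gamma c_{t+1}) z}_{h_t^\gamma(z)}\bigr]
     + \gamma E[\phi_{t+1}(z)] \\
  &= \max_{y \ge x,\, d}\; -K\delta(y-x) + H_t^\gamma(y, d)
     + \gamma E[\phi_{t+1}(y - \alpha_t d - \beta_t)].
\end{align*}
```

The last line is $-K\delta(y - x) + g_t(y, d)$ as in (9.9); maximizing over $d$ first yields $g_t(y, d_t(y))$ and hence (9.8).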

For technical reasons, we need the following assumption on the revenue func- 
tions, and the holding and shortage cost functions. 

Assumption 9.3.1 For $t = 1, 2, \ldots, T$, $-h_t$ is concave and $H_t^\gamma(y, d)$ is well defined for any $y$ and $d \in [\underline{d}_t, \bar{d}_t]$. Therefore $H_t^\gamma(y, d)$ is jointly concave in $y$ and $d$ and consequently,

$$Q_t^\gamma(x) := \max_{\bar{d}_t \ge d \ge \underline{d}_t} H_t^\gamma(x, d) \qquad (9.11)$$

is concave. Furthermore, we assume that for any $t$,

$$\lim_{|x| \to \infty} Q_t^\gamma(x) = -\infty.$$

Notice that one can think of $H_t^\gamma(y, d)$ as being the expected single period profit excluding the ordering cost for a given inventory level $y$ and a selling price associated with a given expected demand, and $Q_t^\gamma(x)$ as being the maximum expected single period profit excluding the ordering cost for a given inventory level $x$ by choosing the best selling price.

9.3.2 Symmetric K-Convex Functions

To motivate the technique used for characterizing the optimal policies for the integrated inventory and pricing models, it is useful to relate our problem to the celebrated stochastic inventory control problem discussed in Chapter 8. In that problem demand is assumed to be exogenously determined, while here demand depends on price. Other assumptions regarding the framework of the model are similar to those made in Chapter 8. In order to prove that an (s, S) policy is optimal for the stochastic inventory models, we employed the concept of K-convexity. It is clear from Definition 8.3.1 that one significant difference between K-convexity and traditional convexity is that (8.4) is not symmetric with respect to $x_0$ and $x_1$, and thus it cannot be trivially extended to multi-dimensional space.

It turns out that this asymmetry is the main barrier when trying to identify the optimal policy for the integrated inventory and pricing problem with non-additive demand functions. Indeed, there exist counterexamples which show that the function $\phi_t$ is not necessarily K-concave and an (s, S) inventory policy is not necessarily optimal for the finite horizon model with multiplicative demand functions. This motivates the development of a new concept, the symmetric K-concave function, which allows us to characterize the optimal policy in the general demand case.

However, under the additive demand model this concept is not needed. Indeed, we prove in Section 9.3.3 that, for additive demand functions, the function $\phi_t$ is K-concave and hence the optimal policy for problem (9.8) is an (s, S, p) policy. Formally, in this policy, in every period $t$ the inventory policy is characterized by two parameters, the reorder point $s_t$ and the order-up-to level $S_t$. An order of size $S_t - x_t$ is made at the beginning of period $t$ if the initial inventory level at the beginning of the period, $x_t$, is smaller than $s_t$. Otherwise, no order is placed. The selling price in period $t$, $p_t$, is a function of the inventory level after an order was made.
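The (s, S, p) rule just described can be written as a one-period decision function. This is a minimal sketch: the function name `ssp_decision` and the callable `price_of`, which maps the post-order inventory level to a price, are hypothetical stand-ins for the quantities $s_t$, $S_t$ and $p_t(\cdot)$ that the analysis would supply.

```python
def ssp_decision(x, s, S, price_of):
    """One-period (s, S, p) decision.

    x        : inventory level at the beginning of the period
    s, S     : reorder point and order-up-to level (s <= S)
    price_of : assumed function giving the price for a post-order inventory level
    Returns (order quantity, selling price).
    """
    y = S if x < s else x          # order up to S only when inventory is below s
    return y - x, price_of(y)

# Illustrative price rule: higher post-order inventory leads to a lower price.
order, price = ssp_decision(2, s=5, S=20, price_of=lambda y: 10 - y / 4)
```

The price depends on the inventory level after ordering, so every state below `s` ends up with the same price `price_of(S)`.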

To characterize the optimal policy for the finite horizon models under general demand functions, we propose a weaker definition of K-convexity, referred to as symmetric K-convexity.

Definition 9.3.2 A function $f : \Re^n \to \Re$ is called symmetric K-convex for $K \ge 0$, if for any $x_0, x_1 \in \Re^n$ and $\lambda \in [0, 1]$,

$$f((1 - \lambda)x_0 + \lambda x_1) \le (1 - \lambda)f(x_0) + \lambda f(x_1) + \max\{\lambda, 1 - \lambda\}K. \qquad (9.12)$$

A function $f$ is called symmetric K-concave if $-f$ is symmetric K-convex.
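Inequality (9.12) is easy to probe numerically for a one-dimensional function. The helper below is an illustrative sketch, not part of the text: it tests (9.12) on a finite set of sample points and values of $\lambda$, so it can refute symmetric K-convexity but can never prove it. The names `is_symmetric_k_convex`, `grid` and `step` are hypothetical.

```python
import itertools

def is_symmetric_k_convex(f, points, K, num_lambdas=21, tol=1e-9):
    """Check inequality (9.12) for all pairs of sample points and a grid of lambdas."""
    lambdas = [i / (num_lambdas - 1) for i in range(num_lambdas)]
    for x0, x1 in itertools.product(points, repeat=2):
        for lam in lambdas:
            lhs = f((1 - lam) * x0 + lam * x1)
            rhs = (1 - lam) * f(x0) + lam * f(x1) + max(lam, 1 - lam) * K
            if lhs > rhs + tol:
                return False          # found a violating triple (x0, x1, lam)
    return True

grid = [-2 + 0.25 * i for i in range(17)]        # sample points in [-2, 2]
step = lambda x: 1.0 if x > 0 else 0.0           # unit step, a non-convex function

convex_ok = is_symmetric_k_convex(lambda x: x * x, grid, 0.0)   # convex case, K = 0
step_k1 = is_symmetric_k_convex(step, grid, 1.0)                # no violation found
step_k_half = is_symmetric_k_convex(step, grid, 0.5)            # violation exists
```

For the unit step with $K = 0.5$, the triple $x_0 = -0.25$, $x_1 = 2$, $\lambda = 0.25$ violates (9.12): the left side is 1 while the right side is 0.625.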

Observe that, similar to the concept of convexity, symmetric K-convexity is defined in a multi-dimensional space while K-convexity is only defined in one dimensional space. Moreover, a K-convex function is a symmetric K-convex function. The following results describe properties of symmetric K-convex functions, properties that are parallel to those summarized in Lemma 8.3.2 and Proposition 8.3.3.

Lemma 9.3.3 (a) A real-valued convex function is also symmetric 0-convex and hence symmetric K-convex for all $K \ge 0$. A symmetric $K_1$-convex function is also a symmetric $K_2$-convex function for $K_1 \le K_2$.

(b) If $g_1(y)$ and $g_2(y)$ are symmetric $K_1$-convex and symmetric $K_2$-convex respectively, then for $\alpha, \beta \ge 0$, $\alpha g_1(y) + \beta g_2(y)$ is symmetric $(\alpha K_1 + \beta K_2)$-convex.

(c) If $g(y)$ is symmetric K-convex and $\xi$ is a random variable, then $E_\xi[g(y - \xi)]$ is also symmetric K-convex, provided $E[|g(y - \xi)|] < \infty$ for all $y$.

(d) Assume that $g : \Re \to \Re$ is a continuous symmetric K-convex function and $g(y) \to \infty$ as $|y| \to \infty$. Let $S$ be a global minimizer of $g$ and $s$ be any element from the set

$$X := \{x \mid x \le S,\ g(x) = g(S) + K \text{ and } g(x') \ge g(x) \text{ for any } x' \le x\}.$$

Then we have the following results.

(i) $g(s) = g(S) + K$ and $g(y) \ge g(s)$ for all $y \le s$.

(ii) $g(y) \le g(z) + K$ for all $y, z$ with $(s + S)/2 \le y \le z$.

Proof. Parts (a), (b) and (c) follow directly from the definition of symmetric K-convexity. Hence we focus on part (d). Since $g$ is continuous and $g(y) \to \infty$ as $|y| \to \infty$, $X$ is not empty. Part (d)(i) is a direct consequence of the fact that $s \in X$.

To prove part (d)(ii) we consider two cases. First, for any $y, z$ with $S \le y \le z$, there exists $\lambda \in [0, 1]$ such that $y = (1 - \lambda)S + \lambda z$, and we have from the definition of symmetric K-convexity that

$$g(y) \le (1 - \lambda)g(S) + \lambda g(z) + \max\{\lambda, 1 - \lambda\}K \le g(z) + K,$$

where the second inequality follows from the fact that $S$ minimizes $g(x)$.

In the second case, consider $y$ such that $S \ge y \ge (s + S)/2$. In this case, there exists $1 \ge \lambda \ge 1/2$ such that $y = (1 - \lambda)s + \lambda S$, and from the definition of symmetric K-convexity we have that

$$g(y) \le (1 - \lambda)g(s) + \lambda g(S) + \lambda K = g(S) + K \le g(z) + K,$$

since $g(s) = g(S) + K$. Hence (i) and (ii) hold. ∎

Figure 9.1 provides an illustration of the property of a symmetric K-convex function in Lemma 9.3.3 part (d). Notice that there might exist a set $A \subset (s, (s + S)/2)$ such that $g(x) > g(S) + K$ for $x \in A$.

We now present another important property of symmetric K-convex functions, which allows us to prove the symmetric K-concavity of the functions $v_t(x)$ and $g_t(y, d)$ by dynamic programming backward induction.

FIGURE 9.1. Illustration of the Properties of a Symmetric K-Convex Function



Proposition 9.3.4 If $f : \Re \to \Re$ is a symmetric K-convex function, then the function

$$g(x) = \min_{y \le x} Q\delta(x - y) + f(y)$$

is symmetric $\max\{K, Q\}$-convex. Similarly, the function

$$h(x) = \min_{y \ge x} Q\delta(y - x) + f(y)$$

is also symmetric $\max\{K, Q\}$-convex.

Proof. We only need to prove the symmetric $\max\{K, Q\}$-convexity of the function $g(x)$. The second part of the result follows from the symmetry property of symmetric K-convexity.

If $Q \ge K$, we know that $f(x)$ is also a symmetric Q-convex function by Lemma 9.3.3 part (a). Hence it suffices to prove that in the case $K \ge Q$, the symmetric K-convexity of the function $f(x)$ implies the symmetric K-convexity of the function $g(x)$. Thus, in the remaining part of the proof, we assume that $K \ge Q$.

Observe that $g(x) \le f(x)$ for any $x$ and $g(x) \le Q + f(y)$ for any $y \le x$. Let $E = \{x \mid g(x) = f(x)\}$ and $R = \{x \mid g(x) < f(x)\}$. We want to show that for any $x_0, x_1$ and $\lambda \in [0, 1]$ with $x_0 \le x_1$,

$$g(x_\lambda) \le (1 - \lambda)g(x_0) + \lambda g(x_1) + \max\{\lambda, 1 - \lambda\}K, \qquad (9.13)$$

where $x_\lambda = (1 - \lambda)x_0 + \lambda x_1$. We will consider four different cases.

Case 1: $x_0, x_1 \in E$. In this case,

$$\begin{aligned}
g(x_\lambda) &\le f(x_\lambda) \\
&\le (1 - \lambda)f(x_0) + \lambda f(x_1) + \max\{\lambda, 1 - \lambda\}K \\
&= (1 - \lambda)g(x_0) + \lambda g(x_1) + \max\{\lambda, 1 - \lambda\}K,
\end{aligned}$$

where the second inequality follows from the symmetric K-convexity of the function $f(x)$.

Case 2: $x_0, x_1 \in R$. In this case, let $g(x_i) = Q + f(y_i)$ for $i = 0, 1$ with $y_i < x_i$ and let $y_\lambda = (1 - \lambda)y_0 + \lambda y_1$. It is clear that $y_0 \le y_1$ and $y_\lambda \le x_\lambda$. Furthermore,

$$\begin{aligned}
g(x_\lambda) &\le Q + f(y_\lambda) \\
&\le (1 - \lambda)(Q + f(y_0)) + \lambda(Q + f(y_1)) + \max\{\lambda, 1 - \lambda\}K \\
&= (1 - \lambda)g(x_0) + \lambda g(x_1) + \max\{\lambda, 1 - \lambda\}K,
\end{aligned}$$

where the second inequality follows from the symmetric K-convexity of the function $f(x)$.

Case 3: $x_0 \in R$, $x_1 \in E$. Let $g(x_0) = Q + f(y_0)$ with $y_0 < x_0$. We will distinguish between two cases.

Subcase 1: $f(y_0) - f(x_1) \le K - Q$. In this case,

$$\begin{aligned}
g(x_\lambda) &\le Q + f(y_0) \\
&= (1 - \lambda)(Q + f(y_0)) + \lambda f(x_1) + \lambda(Q + f(y_0) - f(x_1)) \\
&\le (1 - \lambda)g(x_0) + \lambda g(x_1) + \lambda K.
\end{aligned}$$

Subcase 2: $f(y_0) - f(x_1) > K - Q$. Let $x_\lambda = (1 - \mu)y_0 + \mu x_1$ with $\lambda \le \mu$. Then

$$\begin{aligned}
g(x_\lambda) &\le f(x_\lambda) \\
&\le (1 - \mu)f(y_0) + \mu f(x_1) + \max\{\mu, 1 - \mu\}K \\
&= (1 - \lambda)g(x_0) + \lambda g(x_1) + \max\{\mu, 1 - \mu\}K + (\mu - \lambda)(f(x_1) - f(y_0)) - (1 - \lambda)Q \\
&\le (1 - \lambda)g(x_0) + \lambda g(x_1) + \max\{\mu, 1 - \mu\}K - (1 - \mu)Q - (\mu - \lambda)K \\
&\le (1 - \lambda)g(x_0) + \lambda g(x_1) + \max\{\lambda, 1 - \lambda\}K,
\end{aligned}$$

where the second inequality follows from the symmetric K-convexity of the function $f(x)$ and the third inequality follows from the assumption that $f(y_0) - f(x_1) > K - Q$.

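Case 4: $x_0 \in E$, $x_1 \in R$. Let $g(x_1) = Q + f(y_1)$ for $y_1 < x_1$. Again, we distinguish between two different cases.

Subcase 1: $y_1 \le x_\lambda$. In this case,

$$\begin{aligned}
g(x_\lambda) &\le Q + f(y_1) \\
&= (1 - \lambda)f(x_0) + \lambda(Q + f(y_1)) + (1 - \lambda)(Q + f(y_1) - f(x_0)) \\
&\le (1 - \lambda)g(x_0) + \lambda g(x_1) + (1 - \lambda)Q,
\end{aligned}$$

where the last inequality holds since $f(y_1) \le f(x_0)$.

Subcase 2: $y_1 > x_\lambda$. Let $x_\lambda = (1 - \mu)x_0 + \mu y_1$ with $\lambda \le \mu$. Then

$$\begin{aligned}
g(x_\lambda) &\le f(x_\lambda) \\
&\le (1 - \mu)f(x_0) + \mu f(y_1) + \max\{\mu, 1 - \mu\}K \\
&= (1 - \lambda)g(x_0) + \lambda g(x_1) + \max\{\mu, 1 - \mu\}K + (\mu - \lambda)(f(y_1) - f(x_0)) - \lambda Q, \qquad (9.14)
\end{aligned}$$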
where the second inequality follows from the symmetric K-convexity of the function $f(x)$. On the other hand, since $x_0 \le x_\lambda$,

$$\begin{aligned}
g(x_\lambda) &\le Q + f(x_0) \\
&= (1 - \lambda)g(x_0) + \lambda g(x_1) + \lambda(f(x_0) - f(y_1)) + (1 - \lambda)Q. \qquad (9.15)
\end{aligned}$$

If $\mu \le 1/2$, then inequality (9.14) implies inequality (9.13) since $\max\{\mu, 1 - \mu\} = 1 - \mu \le 1 - \lambda$ and $f(y_1) \le f(x_0)$.

Now assume that $\mu \ge 1/2$. Multiplying (9.14) by $\lambda/\mu$ and (9.15) by $(\mu - \lambda)/\mu$ and adding them together, we have

$$g(x_\lambda) \le (1 - \lambda)g(x_0) + \lambda g(x_1) + \lambda K - \Bigl(\frac{\lambda}{\mu} - (1 - \lambda)\Bigr)Q. \qquad (9.16)$$

If $\lambda \ge 1/2$, then $\frac{\lambda}{\mu} - (1 - \lambda) \ge 0$, which, together with inequality (9.16), implies (9.13). On the other hand, if $\lambda \le 1/2$, we have that

$$\lambda K - \Bigl(\frac{\lambda}{\mu} - (1 - \lambda)\Bigr)Q = (1 - \lambda)K + (2\lambda - 1)(K - Q) + \lambda Q\Bigl(1 - \frac{1}{\mu}\Bigr) \le (1 - \lambda)K,$$

which, together with inequality (9.16), implies (9.13). ∎

In the following, we show that, as with convex functions, symmetric K-convexity can be preserved under optimization operations.

Lemma 9.3.5 Let $f(\cdot, \cdot) : \Re^n \times \Re^m \to \Re$ be symmetric K-convex. Assume that for a given $x \in \Re^n$, there is an associated set $C(x) \subset \Re^m$ and

$$C := \{(x, y) \mid y \in C(x),\ x \in \Re^n\}$$

is convex. Furthermore, assume that

$$g(x) = \min_{y \in C(x)} f(x, y)$$

is well defined and the minimization is attainable for any $x$. Then $g$ is symmetric K-convex.

Proof. For any $x_0, x_1 \in \Re^n$ and $\lambda \in [0, 1]$, let $y_0 \in C(x_0)$ and $y_1 \in C(x_1)$ be such that $g(x_0) = f(x_0, y_0)$ and $g(x_1) = f(x_1, y_1)$. Then

$$(1 - \lambda)y_0 + \lambda y_1 \in C((1 - \lambda)x_0 + \lambda x_1),$$

and

$$\begin{aligned}
g((1 - \lambda)x_0 + \lambda x_1) &\le f((1 - \lambda)x_0 + \lambda x_1, (1 - \lambda)y_0 + \lambda y_1) \\
&\le (1 - \lambda)f(x_0, y_0) + \lambda f(x_1, y_1) + \max\{\lambda, 1 - \lambda\}K \\
&= (1 - \lambda)g(x_0) + \lambda g(x_1) + \max\{\lambda, 1 - \lambda\}K.
\end{aligned}$$

Therefore $g$ is symmetric K-convex. ∎

In the following we focus on characterizing the optimal solution for the finite horizon model. Specifically, our objective is to identify pricing and replenishment policies that solve (9.7) or its equivalent (9.8).

It turns out that in this case the optimal policy for the additive demand model is significantly different from the optimal policy for the general demand case. In particular, we show, in Section 9.3.3, that when the demand function is additive, the function $\phi_t$ is K-concave for any $t$ and hence an (s, S, p) policy is optimal. For more general demand functions, i.e., multiplicative plus additive functions, the function $\phi_t$ is not necessarily K-concave and an (s, S, p) policy is not necessarily optimal. Indeed, in this case we show, in Section 9.3.4, that $\phi_t$ is symmetric K-concave, which allows us to characterize the optimal policy for the general demand model. Finally, in Section 9.3.5, we show that our results imply that in the special case with zero fixed cost and general demand functions, a base stock list price policy is optimal.



9.3.3 Additive Demand Functions 

In the additive demand model the demand function is assumed to be of the form

$$d_t = D_t(p_t) + \beta_t,$$

where $\beta_t$ is a random variable.

Observe that a special case of this demand function is the additive linear demand function in which $d_t = b_t - a_t p_t + \beta_t$ with $b_t, a_t > 0$ for $t = 1, 2, \ldots, T$.

In the following, we show, by induction, that $g_t(y, d_t(y))$ is a K-concave function of $y$ and $\phi_t(x)$ is a K-concave function of $x$. Therefore, the optimality of an (s, S, p) policy follows directly from Lemma 8.3.2.

To prove that $g_t(y, d_t(y))$ is a K-concave function of $y$, we need the following lemma.

Lemma 9.3.6 Suppose that $g_t(y, d)$ is jointly continuous in $(y, d)$. Then, there exists a $d_t(y)$ which maximizes (9.10) such that $y - d_t(y)$ is a non-decreasing function of $y$.

Proof. Define

$$\tilde{g}_t(y, d) := g_t(y, y - d) = R_t(y - d) - c_t y + E[-h_t(d - \beta_t) + \gamma v_{t+1}(d - \beta_t)].$$

Then, Assumption 9.2.1, together with Theorem 2.3.6, implies that the function $\tilde{g}_t(y, d)$ is supermodular. The lemma thus follows from Theorem 2.3.7. ∎

The lemma thus implies that the higher the inventory level at the beginning of time period $t$, $y_t$, the higher the expected inventory level at the end of period $t$, $y_t - d_t(y_t)$. We are now ready to prove our main results for the additive demand model.

Theorem 9.3.7 (a) For any $t = 1, 2, \ldots, T$, $g_t(y, d)$ is jointly continuous in $(y, d)$ and hence for any fixed $y$, $g_t(y, d)$ has a finite maximizer $d_t(y)$ which satisfies Lemma 9.3.6. Furthermore,

$$\lim_{|y| \to \infty} g_t(y, d) = -\infty \text{ for any } d \in [\underline{d}_t, \bar{d}_t] \text{ uniformly}.$$

(b) For any $t = 1, 2, \ldots, T$, $g_t(y, d_t(y))$ and $\phi_t(x)$ are K-concave.

(c) For any $t = 1, 2, \ldots, T$, there exist $s_t$ and $S_t$ with $s_t \le S_t$ such that it is optimal to order $S_t - x_t$ and set the selling price $p_t(x_t) = D_t^{-1}(d_t(S_t))$ when the initial inventory level $x_t < s_t$, and not to order anything and set $p_t(x_t) = D_t^{-1}(d_t(x_t))$ when $x_t \ge s_t$.

Proof. By induction. For period $T$, part (a) directly follows from Assumption 9.3.1. Parts (b) and (c) hold since $g_T(y, d_T(y))$ is concave.

Assume parts (a), (b) and (c) hold for $t + 1$. From part (c) and the continuity of $g_{t+1}(y, d)$,

$$\phi_{t+1}(x) = \begin{cases} -K + g_{t+1}(S_{t+1}, d_{t+1}(S_{t+1})), & \text{if } x < s_{t+1}, \\ g_{t+1}(x, d_{t+1}(x)), & \text{if } x \ge s_{t+1}, \end{cases}$$

which implies that $\phi_{t+1}(x)$ is continuous and hence $g_t(y, d)$ is continuous in $(y, d)$. Thus, for any fixed $y$, $g_t(y, d)$ has a finite maximizer $d_t(y)$ which satisfies Lemma 9.3.6. Part (c) also implies that

$$E[\phi_{t+1}(y - d - \beta_t)] \le \phi_{t+1}(S_{t+1})$$

for any $(y, d)$ and hence $\lim_{|y| \to \infty} g_t(y, d) = -\infty$ for any $d \in [\underline{d}_t, \bar{d}_t]$ uniformly by Assumption 9.3.1. Therefore, part (a) holds for period $t$.

We now focus on part (b). We show that $g_t(y, d_t(y))$ and $\phi_t(x)$ are K-concave based on the assumption that $\phi_{t+1}(x)$ is K-concave.

For any $y < y'$ and $\lambda \in [0, 1]$, we have, from Lemma 9.3.6 and the assumption that $\phi_{t+1}$ is K-concave, that

$$\phi_{t+1}((1 - \lambda)(y - d_t(y) - \beta_t) + \lambda(y' - d_t(y') - \beta_t)) \ge (1 - \lambda)\phi_{t+1}(y - d_t(y) - \beta_t) + \lambda\phi_{t+1}(y' - d_t(y') - \beta_t) - \lambda K.$$

In addition, the concavity of $H_t^\gamma(x, d)$ implies that

$$H_t^\gamma((1 - \lambda)y + \lambda y', (1 - \lambda)d_t(y) + \lambda d_t(y')) \ge (1 - \lambda)H_t^\gamma(y, d_t(y)) + \lambda H_t^\gamma(y', d_t(y')).$$

Taking the expectation of the first inequality, multiplying it by $\gamma$, and adding the second inequality, we get

$$g_t((1 - \lambda)y + \lambda y', (1 - \lambda)d_t(y) + \lambda d_t(y')) \ge (1 - \lambda)g_t(y, d_t(y)) + \lambda g_t(y', d_t(y')) - \lambda\gamma K.$$

From the definition of $d_t((1 - \lambda)y + \lambda y')$, we have

$$g_t((1 - \lambda)y + \lambda y', d_t((1 - \lambda)y + \lambda y')) \ge g_t((1 - \lambda)y + \lambda y', (1 - \lambda)d_t(y) + \lambda d_t(y')),$$

and hence,

$$g_t((1 - \lambda)y + \lambda y', d_t((1 - \lambda)y + \lambda y')) \ge (1 - \lambda)g_t(y, d_t(y)) + \lambda g_t(y', d_t(y')) - \lambda\gamma K,$$

that is, $g_t(y, d_t(y))$ is a $\gamma K$-concave function of $y$, and hence K-concave by Lemma 8.3.2 part (a). Furthermore, Proposition 8.3.3 implies that $\phi_t$ is K-concave. Hence part (b) holds for period $t$.

We now prove part (c). Since $g_t(y, d_t(y))$ is K-concave, Lemma 8.3.2 part (d) implies that there exist $s_t$ and $S_t$ such that $S_t$ maximizes $g_t(y, d_t(y))$ and $s_t$ is the smallest value of $y$ such that $g_t(S_t, d_t(S_t)) = g_t(y, d_t(y)) + K$, and

$$\phi_t(x) = \begin{cases} -K + g_t(S_t, d_t(S_t)), & \text{if } x < s_t, \\ g_t(x, d_t(x)), & \text{if } x \ge s_t. \end{cases}$$

Hence part (c) holds. ∎

Thus, Theorem 9.3.7 implies that an (s, S, p) policy is optimal when the demand 
is additive. An interesting question is whether a list price policy is optimal, as is 
the case for the single period model with no fixed cost. Unfortunately, this property 
does not hold for the finite horizon model as illustrated by Chen and Simchi-Levi 
(2002a). 

9.3.4 General Demand Functions 

In this section, we focus on the model with general demand functions (9.1). Observe that the additive demand function analyzed in the previous section is a special case of the general demand function (9.1). More importantly, multiplicative demand functions of the form $d_t = \alpha_t D_t(p)$ where $D_t(p) = a_t p^{-b_t}$ ($a_t > 0$, $b_t > 1$), or demand functions of the form $d_t = \beta_t + \alpha_t(b_t - a_t p)$ ($a_t > 0$, $b_t > 0$), are also special cases.

To characterize the optimal policy for the model with the demand functions (9.1), one might consider using the same approach applied in Section 9.3.3. Unfortunately, in this case, the function $y - \alpha_t d_t(y)$ is not necessarily a non-decreasing function of $y$ for all possible $\alpha_t$, as is the case for additive demand functions. Hence, the approach employed in Section 9.3.3 does not work in this case. In fact, as demonstrated in Chen and Simchi-Levi (2002a), the functions $g_t(y, d_t(y))$ and $\phi_t(x)$ are in general not K-concave and an (s, S, p) policy is not necessarily optimal.

To overcome these difficulties, we apply the concept of symmetric K-convexity introduced in Section 9.3.2. Specifically, in the following, we show, by induction, that $g_t(y, d)$ is a symmetric K-concave function of $(y, d)$ and $\phi_t(x)$ is a symmetric K-concave function of $x$. Hence a characterization of the optimal pricing and ordering policies follows from Lemma 9.3.3.

Theorem 9.3.8 (a) For any $t$, $g_t(y, d)$ is continuous in $(y, d)$ and hence for any fixed $y$, $g_t(y, d)$ has a finite maximizer $d_t(y)$. Furthermore,

$$\lim_{|y| \to \infty} g_t(y, d) = -\infty \text{ for any } d \in [\underline{d}_t, \bar{d}_t] \text{ uniformly}.$$

(b) For any $t = 1, 2, \ldots, T$, $g_t(y, d)$ and $\phi_t(x)$ are symmetric K-concave.

(c) For any $t = 1, 2, \ldots, T$, there exist $s_t$ and $S_t$ with $s_t \le S_t$ and a set $A_t \subset [s_t, (s_t + S_t)/2]$ such that it is optimal to order $S_t - x_t$ and set the selling price $p_t = p_t(S_t)$ when the initial inventory level $x_t < s_t$ or $x_t \in A_t$, and not to order anything and set $p_t = p_t(x_t)$ otherwise.

Proof. The proof of part (a) is similar to the proof of part (a) in Theorem 9.3.7. We now focus on part (b).

By induction. $\phi_{T+1}(x) = 0$ is symmetric 0-concave. From the symmetric K-concavity of $\phi_{t+1}(x)$, we have that $E[\phi_{t+1}(y - \alpha_t d - \beta_t)]$ is symmetric K-concave. Also, we have that $H_t^\gamma(y, d)$ is concave by Assumption 9.3.1. Hence, $g_t(y, d)$ is symmetric $\gamma K$-concave and hence by Lemma 9.3.5, the function $g_t(y, d_t(y))$ is symmetric $\gamma K$-concave. Finally, $g_t(y, d_t(y))$ is symmetric K-concave by Lemma 9.3.3 part (a) and the symmetric K-concavity of $\phi_t(x)$ follows from Proposition 9.3.4. Thus, part (b) holds.

We now prove part (c). From Lemma 9.3.3 part (d) we have

$$\phi_t(x) = \begin{cases} -K + g_t(S_t, d_t(S_t)), & \text{if } x \in I_t, \\ g_t(x, d_t(x)), & \text{if } x \notin I_t, \end{cases}$$

where $S_t$ is the maximizer of $g_t(y, d_t(y))$ and

$$I_t = \{y \le S_t \mid g_t(y, d_t(y)) \le g_t(S_t, d_t(S_t)) - K\}.$$

Furthermore, $\phi_t(x) \ge g_t(x, d_t(x))$ for any $x$ and $\phi_t(x) \ge -K + g_t(S_t, d_t(S_t))$ for any $x \le S_t$.

Let $s_t$ be defined as the smallest value of $y$ such that $g_t(S_t, d_t(S_t)) = g_t(y, d_t(y)) + K$. Note that from Lemma 9.3.3 part (d), $(-\infty, s_t] \subset I_t$ and $[(s_t + S_t)/2, \infty) \subset (I_t)^c$, the complement of $I_t$. Part (c) follows from Lemma 9.3.3 and part (b) by defining

$$A_t = I_t \cap [s_t, (s_t + S_t)/2].$$

∎

Theorem 9.3.8 thus implies that the optimal policy for problem (9.7) is an (s, S, A, p) policy. Such a policy is characterized by two parameters $s_t$ and $S_t$ and a set $A_t \subset [s_t, (s_t + S_t)/2]$, possibly empty. When the inventory level $x_t$ at the beginning of period $t$ is less than $s_t$ or $x_t$ is in the set $A_t$, an order of size $S_t - x_t$ is made. Otherwise, no order is placed. Thus, it is possible that an order will be placed when the inventory level $x_t \in [s_t, (s_t + S_t)/2]$, depending on the problem instance. In any case, if an order is placed, it is always to raise the inventory level to $S_t$.
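The construction of $S_t$, $s_t$ and $A_t$ in the proof of Theorem 9.3.8 can be mimicked on a grid. The sketch below is illustrative only: it assumes the values $g_t(y, d_t(y))$ have already been computed at increasing grid points, approximates $s_t$ by the last grid point of the initial run of points lying in $I_t$, and uses the hypothetical names `ssa_from_grid` and `order_quantity`.

```python
def ssa_from_grid(grid, g_vals, K):
    """Approximate (s, S, A) from grid values g_vals[i] = g_t(grid[i], d_t(grid[i]))."""
    i_S = max(range(len(grid)), key=lambda i: g_vals[i])   # first maximizer of g
    S = grid[i_S]
    threshold = g_vals[i_S] - K
    # I_t: points y <= S where ordering up to S (paying K) is at least as good
    in_I = [i <= i_S and g_vals[i] <= threshold for i in range(len(grid))]
    # s: last grid point such that every grid point up to it lies in I_t
    i_s = -1
    while i_s + 1 < len(grid) and in_I[i_s + 1]:
        i_s += 1
    if i_s < 0:
        return None, S, []           # ordering is never worthwhile on this grid
    s = grid[i_s]
    # A_t = I_t intersected with [s, (s + S) / 2]
    A = [y for y, flag in zip(grid, in_I) if flag and s <= y <= (s + S) / 2]
    return s, S, A

def order_quantity(x, s, S, A):
    """(s, S, A, p) ordering rule: order up to S when x < s or x is in A."""
    if s is not None and (x < s or x in A):
        return S - x
    return 0

grid = list(range(11))
g_vals = [-(y - 5) ** 2 for y in grid]    # concave example, so A contains only s
s, S, A = ssa_from_grid(grid, g_vals, K=4)
```

In this concave example the set `A` reduces to the single point `s`; for a function that is symmetric K-concave but not K-concave, `A` may contain further points between `s` and `(s + S) / 2`.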



9.3.5 Special Case: Zero Fixed Ordering Cost 

We now apply our results to the zero fixed cost case. 

Corollary 9.3.9 Consider our model with zero fixed ordering cost and general demand functions (9.1). In this case, a base stock list price policy is optimal.

Proof. By Theorem 9.3.8, the functions $\phi_t(x)$ and $g_t(y, d_t(y))$, $t = 1, 2, \ldots, T$, are symmetric 0-concave and hence, from Definition 9.3.2, they are concave. The optimality of the base-stock inventory policy directly follows from the concavity of $g_t(y, d_t(y))$ for $t = 1, 2, \ldots, T$.

We now show that $d_t(y)$ is non-decreasing and therefore the optimal price $p_t(y)$ is non-increasing. In fact, in the zero fixed ordering cost case, Theorem 2.3.6 implies that $g_t(y, d)$ is supermodular. Therefore, from Theorem 2.3.7, there exists $d_t(y)$ which is non-decreasing. ∎



9.3.6 Extensions and Challenges 

In the previous subsections, we showed, by employing the classic concept of K-convexity, that an (s, S, p) policy is optimal for the additive demand case. By using the weaker concept of symmetric K-convexity, we showed that an (s, S, A, p) policy is optimal for the general demand case. Thus, it is a natural conjecture that a stationary (s, S, A, p) policy is optimal for a corresponding infinite horizon model with stationary parameters and general demand functions. However, surprisingly, as shown by Chen and Simchi-Levi (2002b), a stationary (s, S, p) policy is optimal for the infinite horizon model under either the discounted profit or the average profit criterion, and the proof is also based on the concept of symmetric K-concavity. Table 9.1 is a summary of structural results and concepts used for analyzing the inventory (and pricing) models.





                         Inventory Model       Joint Inventory and Pricing Model

No Fixed Ordering Cost   Base Stock Policy     Base Stock List Price Policy
                         (Convexity)           (Concavity)

Fixed Ordering Cost      (s,S) Policy          Finite Horizon Case, additive demand:
                         (K-Convexity)           (s,S,p) Policy (K-Concavity)
                                               Finite Horizon Case, general demand:
                                                 (s,S,A,p) Policy (Symmetric K-Concavity)
                                               Infinite Horizon Case:
                                                 (s,S,p) Policy (Symmetric K-Concavity)

TABLE 9.1. Summary of Results and Tools for the Inventory (and Pricing) Problems

Of course, it is appropriate to point out that the results in this chapter may not hold for problems with discrete prices (see Chen (2003)). Indeed, if price is restricted to take values from a discrete set, even the single period profit function may not be concave and our analysis no longer works. This fact poses a significant challenge for solving the integrated inventory and pricing models, since in order to solve these models, one has to use discrete inventory levels and discrete prices. Thus, a natural question is whether one can design efficient algorithms by employing the structural results on the optimal policy identified in the previous subsections.

Another challenge for the integrated inventory and pricing models analyzed in this section is the zero lead time assumption. This is not the case for the standard inventory control problems. In fact, for the standard stochastic inventory models, the structural results of the optimal policies can generally be extended to models with deterministic lead time, as we pointed out at the end of Section 8.5. The idea is to transform a model with positive lead time into one with a similar structure but with zero lead time. However, this technique is not valid here, since for our models with positive lead time, the two decisions, the ordering decision and the pricing decision, take effect at different times.



9.4 Risk Averse Inventory Models 

All the inventory (and pricing) models discussed so far focus on risk neutral decision makers, i.e., inventory managers who are insensitive to profit variations. Evidently, not all inventory managers are risk neutral; many planners are willing to trade off lower expected profit for downside protection against possible losses. Indeed, experimental evidence suggests that for some products, the so-called high-profit products, decision makers are risk averse; see Schweitzer and Cachon (2000) for more details. Unfortunately, traditional inventory control models fall short of meeting the needs of risk averse planners. For instance, traditional inventory models do not suggest mechanisms to reduce the chance of unfavorable profit levels. Thus, it is important to incorporate the notion of risk aversion in a broad class of inventory models.

The literature on risk averse inventory models is quite limited and mainly focuses on single period problems or is based on mean-variance tradeoffs. For instance, Lau (1980) analyzes the classical newsvendor model, in which he maximizes the decision maker’s expected utility of total profit or the probability of achieving a certain level of profit. Eeckhoudt, Gollier and Schlesinger (1995) focus on the impact of risk and risk aversion in the newsvendor model when risk is measured by expected utility functions.

Chen and Federgruen (2000) analyze the mean-variance tradeoffs in newsven- 
dor models as well as some standard infinite horizon inventory models. Specifically, 
in the infinite horizon models, Chen and Federgruen focus on the mean-variance 
tradeoff of customer waiting time as well as the mean-variance tradeoffs of inventory levels. Martínez-de-Albéniz and Simchi-Levi (2003) study the mean-variance
tradeoffs faced by a manufacturer signing a portfolio of option contracts with its 
suppliers and having access to a spot market. 

Assuming a linear ordering cost, Bouakiz and Sobel (1992) minimize the expected exponential utility of the present value of costs over a finite planning horizon or an infinite horizon. In particular, they show that a base stock policy is optimal.

So far all the papers referenced above assume that demand is exogenous. A rare 
exception is Agrawal and Seshadri (2000) who consider a risk averse retailer which 
has to decide on its ordering quantity and selling price for a single period. They 
demonstrate that different assumptions on the demand-price function may lead to 
different properties of the selling price. 

In this section, we discuss a general framework for incorporating risk aversion 
in multi-period inventory (and pricing) models, in which risk is measured based 
on increasing and concave utility functions. Our analysis is based on Chen, Sim, 
Simchi-Levi and Sun (2004). 

The assumptions made in the risk averse models are similar to those in the joint inventory and pricing models analyzed in Section 9.3. One exception is that demand is a linear function of the selling price, i.e., $D_t(p)$ is a linear function of $p$. More importantly, the objective of the risk averse decision maker is to maximize the expected utility of the total discounted profit over the planning horizon. That is, the objective is to maximize

$$E[u(V_T^\gamma(x))] \qquad (9.17)$$

for any initial inventory level $x$ and any given $0 < \gamma \le 1$, where $u(\cdot)$ is a utility function and $V_T^\gamma(x)$ is defined in (9.5).

We require the utility function, u(x), to be increasing so that more is always 
preferred over less. Of course, if u(x) is a linear and increasing function, the model 
(9.17) yields the same optimal solution as the risk neutral model of (9.6). We 
also assume that the utility function is concave so that the marginal satisfaction 
of gaining a dollar is never more than the marginal loss of satisfaction associated 
with losing the same amount of money. It is appropriate to point out that expected 
utility theory is widely used in microeconomics and finance literature. 
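A small numerical comparison may make the role of concavity concrete. The sketch below is not from the text: it assumes two hypothetical profit distributions with the same mean and an exponential utility $u(w) = 1 - e^{-w/\rho}$ with an arbitrary risk tolerance $\rho = 10$; the names `expected_utility` and `exp_utility` are illustrative.

```python
import math

def expected_utility(outcomes, u):
    """Expected utility of a discrete profit distribution [(profit, probability), ...]."""
    return sum(prob * u(profit) for profit, prob in outcomes)

def exp_utility(rho):
    """Increasing, concave exponential utility with (assumed) risk tolerance rho."""
    return lambda w: 1.0 - math.exp(-w / rho)

sure = [(10.0, 1.0)]                    # profit of 10 for certain
risky = [(0.0, 0.5), (20.0, 0.5)]       # same mean, but variable profit

linear = lambda w: w
neutral_gap = expected_utility(sure, linear) - expected_utility(risky, linear)
averse_gap = expected_utility(sure, exp_utility(10.0)) - expected_utility(risky, exp_utility(10.0))
```

Under the linear utility the two alternatives are equivalent (`neutral_gap` is zero), which matches the remark that a linear increasing utility reproduces the risk neutral model; under the concave utility the certain profit is strictly preferred (`averse_gap` is positive).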

In the next subsection, we discuss the risk averse framework based on a general 
increasing and concave utility function. This is followed by a subsection on models 
based on an important special case, the exponential utility. 



9.4.1 Expected Utility Risk Averse Models

Unlike the risk neutral models analyzed in Section 9.3, the objective function (9.17) in its current form appears not to be decomposable and is not amenable to the dynamic programming approach. To deal with this issue, we introduce a new variable $w$ to denote the wealth accumulated from the beginning of the planning horizon up to the current period. Thus, the state of the problem at period $t$ can now be modelled as the inventory level $x_t$ and the wealth accumulated from period 1 to period $t$, $w_t$.

Consider the expected utility measure. Let $W_t(x, w)$ be the maximum utility achievable starting at the beginning of period $t$ with an initial inventory level $x$ and an accumulated wealth $w$. The dynamic program can be written as follows. Let

$$W_{T+1}(x, w) = u(w),$$

and for $t = 1, 2, \ldots, T$,

$$W_t(x, w) = \max_{y \ge x,\ \bar{p}_t \ge p \ge \underline{p}_t} E[W_{t+1}(x^+, w^+)], \qquad (9.18)$$

where

$$x^+ = y - D_t(p, \epsilon_t),$$

and

$$w^+ = w + \gamma^{t-1}\bigl(-K\delta(y - x) - c_t(y - x) + p D_t(p, \epsilon_t) - h_t(y - D_t(p, \epsilon_t))\bigr). \qquad (9.19)$$

We would like to emphasize that in this section, $D_t(p, \epsilon_t)$ is linear in $p$. Also notice that here we assume, without loss of generality, that $W_{T+1}(x, w)$ is independent of $x$, which implies zero salvage value. Finally, we have

$$\max E[u(V_T^\gamma(x))] = W_1(x, 0).$$

Instead of working with the dynamic program (9.18), we find that it is more convenient to work with an equivalent formulation. Let

$$U_t(x, w) = W_t(x, w - \gamma^{t-1} c_t x).$$

The dynamic program (9.18) becomes

$$U_t(x, w) = \max_{y \ge x,\ \bar{p}_t \ge p \ge \underline{p}_t} E[U_{t+1}(x^+, w^+)], \qquad (9.20)$$

where

$$w^+ = w + \gamma^{t-1}\bigl(-K\delta(y - x) + f_t(y, p, \epsilon_t)\bigr),$$

and

$$f_t(y, p, \epsilon_t) = -(c_t - \gamma c_{t+1})y + (p - \gamma c_{t+1})D_t(p, \epsilon_t) - h_t(y - D_t(p, \epsilon_t)). \qquad (9.21)$$
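The equivalence of the two formulations can be checked on a single transition. The sketch below is illustrative: it assumes a hypothetical linear demand $D(p, \epsilon) = \alpha(b - ap) + \beta$ and a piecewise linear $h$, and verifies that a wealth state shifted by $\gamma^{t-1} c_t x$ before the transition is shifted by $\gamma^{t} c_{t+1} x^+$ after it, which is what the change of variables $U_t(x, w) = W_t(x, w - \gamma^{t-1} c_t x)$ requires. All function names are assumptions.

```python
def delta(z):
    """Indicator of a positive order quantity."""
    return 1.0 if z > 0 else 0.0

def demand(p, alpha, beta, a=1.0, b=10.0):
    """Assumed demand, linear in the price p."""
    return alpha * (b - a * p) + beta

def h(x, h_pos=1.0, h_neg=4.0):
    """Assumed holding (x > 0) and shortage (x < 0) cost."""
    return h_pos * max(x, 0.0) + h_neg * max(-x, 0.0)

def step_W(x, w, y, p, eps, t, K, c_t, gamma):
    """State transition (x+, w+) of the original program, following (9.19)."""
    d = demand(p, *eps)
    profit = -K * delta(y - x) - c_t * (y - x) + p * d - h(y - d)
    return y - d, w + gamma ** (t - 1) * profit

def step_U(x, w, y, p, eps, t, K, c_t, c_next, gamma):
    """State transition (x+, w+) of the transformed program, following (9.20)-(9.21)."""
    d = demand(p, *eps)
    f = -(c_t - gamma * c_next) * y + (p - gamma * c_next) * d - h(y - d)
    return y - d, w + gamma ** (t - 1) * (-K * delta(y - x) + f)

# One transition with arbitrary illustrative numbers.
x, w_W, t, gamma, c_t, c_next, K = 3.0, 7.0, 2, 0.9, 2.0, 2.5, 5.0
y, p, eps = 8.0, 6.0, (1.1, -0.5)
w_U = w_W + gamma ** (t - 1) * c_t * x            # shifted wealth state
x_W, w_W_next = step_W(x, w_W, y, p, eps, t, K, c_t, gamma)
x_U, w_U_next = step_U(x, w_U, y, p, eps, t, K, c_t, c_next, gamma)
shift_after = w_U_next - w_W_next                 # equals gamma**t * c_next * x_W up to rounding
```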

We have the following observation, which can be easily verified by induction. 

Lemma 9.4.1 For any period t and fixed x, Ut(x,w) is increasing in w. 

Interestingly, this observation allows us to show that a wealth dependent base 
stock inventory policy is optimal when there is zero fixed ordering cost. 



Theorem 9.4.2 Assume that K = 0. In this case, Ut(x,w) is jointly concave in 
x and w for any period t. Furthermore, a wealth dependent base stock inventory 
policy is optimal for the risk averse inventory (and pricing) problem (9.17). 







Proof. We prove by induction. Obviously, $U_{T+1}(x,w)$ is jointly concave in $x$ and
$w$. Assume that $U_{t+1}(x,w)$ is jointly concave in $x$ and $w$. We now prove that a
wealth dependent base stock inventory policy is optimal and $U_t(x,w)$ is jointly
concave in $x$ and $w$.

First, notice that for any realization of $\epsilon_t$, $f_t$ is jointly concave in $(y,p)$, which
implies that $w^+$ is jointly concave in $(w,x,y,p)$.

Since $x^+$ is a linear function of $(y,p)$ and $w^+$ is jointly concave in $(w,x,y,p)$,
Lemma 9.4.1 allows us to show that $U_{t+1}(x^+, w^+)$ is jointly concave in $(w,x,y,p)$.
This implies that $E[U_{t+1}(x^+, w^+)]$ is jointly concave in $(w,x,y,p)$.

We now prove that a $w$-dependent base stock inventory policy is optimal. Let
$y^*(w)$ be an optimal solution for the problem

$$\max_{y} \Bigl\{ \max_{\bar{p}_t \ge p \ge \underline{p}_t} E[U_{t+1}(x^+, w^+)] \Bigr\}.$$

Since $E[U_{t+1}(x^+, w^+)]$ is concave in $y$ for any fixed $w$, it is optimal to order up
to $y^*(w)$ when $x < y^*(w)$ and not to order otherwise. In other words, a state
dependent base stock inventory policy is optimal.

Finally, according to Proposition 2.2.15, $U_t(x,w)$ is jointly concave. ∎

Recall that in the case of a risk neutral decision maker, a base stock list price 
policy is optimal. Theorem 9.4.2 thus implies that in the case of an increasing 
concave utility risk measure, the optimal policy is quite different. Indeed, in these 
cases, the base stock level depends on the total profit accumulated from the be- 
ginning of the planning horizon and it is not clear whether a list price policy is 
optimal. 

Stronger results exist for models based on the exponential utility risk measure, 
as is demonstrated in the next subsection. 

9.4.2 Exponential utility risk averse models

We now focus on exponential utility functions of the form $u(w) = b(1 - \exp(-w/b))$
with parameter $b > 0$. The beauty of exponential utility functions is that we can
essentially separate $x$ and $w$ as is illustrated in the next theorem.

Theorem 9.4.3 For any time period $t$, there exists a function $G_t(x)$ such that

$$U_t(x,w) = u(w + \gamma^{t-1} G_t(x)).$$

Proof. We prove by induction. For $t = T + 1$, $G_{T+1}(x) = 0$ for any $x$. Assume
that there exists a function $G_{t+1}(x)$ such that

$$U_{t+1}(x,w) = u(w + \gamma^{t} G_{t+1}(x)).$$

From the recursion (9.20), we have that

$$U_t(x,w) = \max_{y \ge x,\ \bar{p}_t \ge p \ge \underline{p}_t} bE\bigl[1 - \exp\bigl(-(w^+ + \gamma^{t} G_{t+1}(y - D_t(p,\epsilon_t)))/b\bigr)\bigr]$$

$$= b - b\exp(-w/b) \min_{y \ge x,\ \bar{p}_t \ge p \ge \underline{p}_t} \exp\bigl(\gamma^{t-1}(K\delta(y-x) - L_t(y,p))/b\bigr)$$

$$= u(w + \gamma^{t-1} G_t(x)),$$

where

$$L_t(y,p) = -\frac{b}{\gamma^{t-1}} \ln E\bigl[\exp\bigl(-\gamma^{t-1}(f_t(y,p,\epsilon_t) + \gamma G_{t+1}(y - D_t(p,\epsilon_t)))/b\bigr)\bigr]$$

and

$$G_t(x) = \max_{y \ge x,\ \bar{p}_t \ge p \ge \underline{p}_t} -K\delta(y-x) + L_t(y,p). \qquad (9.22)$$

Thus the result is true. ∎

The theorem thus implies that the optimal policy is independent of the accu- 
mulated wealth when exponential utility functions are used, which significantly 
simplifies the problem. In fact, the optimal policy can be found by solving prob- 
lem (9.22). Furthermore, this theorem, together with Theorem 9.4.2, implies that 
when there is zero fixed ordering cost, a base stock inventory policy is optimal un- 
der the exponential utility risk criterion independent of whether price is a decision 
variable or not. 
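
The wealth independence can be checked numerically on a toy instance. The sketch below assumes a hypothetical single-period, lost-sales problem with a three-point demand distribution; the data, the function names and the grid of stocking levels are illustrative assumptions, not taken from the text. It compares the stocking level that maximizes expected exponential utility at several wealth levels with the level that maximizes the wealth-free certainty equivalent $-b \ln E[\exp(-\text{profit}/b)]$, which plays the role of $L_t$ in this one-period setting.

```python
import math

# Hypothetical single-period data (illustrative only, not from the text).
DEMAND = [(5, 0.3), (10, 0.4), (15, 0.3)]  # (demand value, probability)
PRICE, COST, B = 10.0, 6.0, 20.0           # selling price, unit cost, utility parameter b


def exp_utility(w, b):
    """Exponential utility u(w) = b * (1 - exp(-w / b)) with b > 0."""
    return b * (1.0 - math.exp(-w / b))


def profit(y, d):
    """One-period profit when stocking y units and demand d occurs (lost sales)."""
    return PRICE * min(y, d) - COST * y


def expected_utility(y, w):
    """E[u(w + profit)] for stocking level y and accumulated wealth w."""
    return sum(p * exp_utility(w + profit(y, d), B) for d, p in DEMAND)


def certainty_equivalent(y):
    """-b * ln E[exp(-profit / b)], which does not involve the wealth w."""
    return -B * math.log(sum(p * math.exp(-profit(y, d) / B) for d, p in DEMAND))


def best_level(w, levels=range(0, 21)):
    """Stocking level on the grid that maximizes expected utility at wealth w."""
    return max(levels, key=lambda y: expected_utility(y, w))


if __name__ == "__main__":
    for w in (0.0, 25.0, 100.0):
        print("wealth", w, "best stocking level", best_level(w))
    print("certainty-equivalent maximizer", max(range(0, 21), key=certainty_equivalent))
```

In this toy instance the same stocking level is selected at every wealth level, as the factorization $E[u(w + \cdot)] = b - b\exp(-w/b)E[\exp(-\cdot/b)]$ suggests.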

Before we present our main result for the problem with $K > 0$, recall the famous
Hölder inequality.

Theorem 9.4.4 Assume $p, q > 0$ with $1/p + 1/q = 1$. If $f$ and $g$ are continuous
functions on $\mathbb{R}$ with $\int_{\mathbb{R}} |f(x)|^p dx < \infty$ and $\int_{\mathbb{R}} |g(x)|^q dx < \infty$, then

$$\int_{\mathbb{R}} |f(x)g(x)|\,dx \le \Bigl(\int_{\mathbb{R}} |f(x)|^p dx\Bigr)^{1/p} \Bigl(\int_{\mathbb{R}} |g(x)|^q dx\Bigr)^{1/q}.$$

An important corollary of the Hölder inequality is as follows.

Theorem 9.4.5 If a function $f$ is convex, $K$-convex or symmetric $K$-convex, then
the function

$$g(x) = \ln(E[\exp(f(x - \xi))])$$

is also convex, $K$-convex or symmetric $K$-convex respectively.

Proof. We only prove the case with $K$-convexity; the other two cases can be proven
by following similar steps.

Define $M(x) = E[\exp(f(x - \xi))]$. It suffices to prove that for any $x_0, x_1$ with
$x_0 \le x_1$ and any $\lambda \in [0, 1]$,

$$M(x_\lambda) \le M(x_0)^{1-\lambda} M(x_1)^{\lambda} \exp(\lambda K),$$

where $x_\lambda = (1 - \lambda)x_0 + \lambda x_1$. Notice that

$$M(x_\lambda) \le E[\exp((1 - \lambda)f(x_0 - \xi) + \lambda f(x_1 - \xi) + \lambda K)]$$

$$= \exp(\lambda K) E[\exp((1 - \lambda)f(x_0 - \xi)) \exp(\lambda f(x_1 - \xi))]$$

$$\le \exp(\lambda K) E[\exp(f(x_0 - \xi))]^{1-\lambda} E[\exp(f(x_1 - \xi))]^{\lambda}$$

$$= M(x_0)^{1-\lambda} M(x_1)^{\lambda} \exp(\lambda K),$$

where the first inequality holds since $f$ is $K$-convex and the second inequality
follows from the Hölder inequality with $1/p = 1 - \lambda$ and $1/q = \lambda$. ∎
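
The convex case of this result is easy to probe numerically. The following sketch assumes a hypothetical three-point distribution for the noise term and an arbitrarily chosen convex function; both are illustrative choices, not part of the text. It measures the largest violation of the convexity inequality for $g(x) = \ln E[\exp(f(x - \xi))]$ over a grid; up to rounding error the violation should not be positive.

```python
import math

# Hypothetical discrete noise: (value, probability) pairs, illustrative only.
XI = [(-1.0, 0.25), (0.0, 0.5), (2.0, 0.25)]


def f(x):
    """A convex function chosen purely for illustration."""
    return abs(x) + 0.1 * x * x


def g(x):
    """g(x) = ln E[exp(f(x - xi))] for the discrete noise above."""
    return math.log(sum(p * math.exp(f(x - xi)) for xi, p in XI))


def max_convexity_violation(func, points, lambdas):
    """Largest value of func(x_lam) - [(1 - lam) func(x0) + lam func(x1)] on the grid."""
    worst = -math.inf
    for x0 in points:
        for x1 in points:
            for lam in lambdas:
                x_lam = (1 - lam) * x0 + lam * x1
                gap = func(x_lam) - ((1 - lam) * func(x0) + lam * func(x1))
                worst = max(worst, gap)
    return worst


if __name__ == "__main__":
    grid = [0.5 * i for i in range(-8, 9)]
    print(max_convexity_violation(g, grid, [0.0, 0.25, 0.5, 0.75, 1.0]))
```

A grid check of this kind is only a sanity test and does not replace the proof; it can, however, catch a wrong sign or a misplaced exponent when implementing $L_t$.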

We can now present the optimal policy for the risk averse multi-period inventory
(and pricing) problem with exponential utility function.

                               Price Not a Decision           Price is a Decision
                               K = 0             K > 0        K = 0                  K > 0
Risk Neutral Model             base stock        (s, S)       base stock list price  (s, S, A, p)
Exponential Utility            base stock        (s, S)       base stock             (s, S, A, p)
Increasing & Concave Utility   wealth dependent  ?            wealth dependent       ?
                               base stock                     base stock

TABLE 9.2. Summary of Results for Risk Neutral and Risk Averse Models

Theorem 9.4.6 (a) If price is not a decision variable (i.e., $\underline{p}_t = \bar{p}_t$ for each $t$),
$G_t(x)$ and $L_t(y,p)$ are $K$-concave and an $(s,S)$ inventory policy is optimal.

(b) If price is a decision variable, $G_t(x)$ and $L_t(y,p)$ are symmetric $K$-concave
and an $(s, S, A, p)$ policy is optimal.

Proof. We only provide a sketch of the proof; the complete proof is left as an
exercise. The main idea of the proof is as follows: if $G_{t+1}(x)$ is $K$-concave when
price is not a decision variable (or symmetric $K$-concave when price is a decision
variable), then, by Theorem 9.4.5, $L_t(y,p)$ is $K$-concave (or symmetric $K$-concave).
The remaining parts follow directly from Lemma 8.3.2 and Proposition 8.3.3 for
$K$-concavity (or Lemma 9.3.3 and Proposition 9.3.4 for symmetric $K$-concavity). ∎

We observe the similarities and differences between the optimal policy under the 
exponential utility measure and the one under the risk neutral case. Indeed, when 
demand is exogenous, i.e., price is not a decision variable, an (s, S) inventory policy 
is optimal for the risk neutral case; see Theorem 8.3.4. Theorem 9.4.6 implies that 
this is also true under the exponential utility measure. Similarly, for the more 
general inventory and pricing problem, Theorem 9.3.8 implies that an (s, S, A, p) 
policy is optimal for the risk neutral case. Interestingly, this policy is also optimal 
for the exponential utility case. 

Of course, the results for the risk neutral case are a bit stronger. Indeed, if 
demand is additive, Theorem 9.3.7 suggests that an (s,S, p) policy is optimal. 
Unfortunately, it is not clear whether this result still holds for the risk averse 
inventory and pricing problem under exponential risk measure. 

The structural results of the optimal policies for the risk averse models as well 
as risk neutral models are summarized in Table 9.2.

Exercise 9.1. Prove Theorem 9.4.5 using Exercise 2.5.

Exercise 9.2. Complete the proof of Theorem 9.4.6.

Exercise 9.3. Recall the single period model analyzed in Section 9.2. We modify
the model as follows. Instead of placing an emergency order to satisfy shortages,
we assume that unsatisfied demand is lost. In this case, $h(x)$ is the penalty cost
for lost sales if $x < 0$. Show that the optimal selling price for the additive demand
case is no more than the one for the deterministic demand case, which in turn is
no more than the one for the multiplicative demand case.

Exercise 9.4. Building on the concept of symmetric $K$-convexity, Ye and Duenyas
(2003) introduce the concept of $(K, Q)$-convexity. A real-valued function $f$ is called
$(K, Q)$-convex for $K, Q \ge 0$, if for any $x_0, x_1$ with $x_0 \le x_1$ and $\lambda \in [0, 1]$,

$$f((1-\lambda)x_0 + \lambda x_1) \le (1-\lambda)f(x_0) + \lambda f(x_1) + \lambda K + (1-\lambda)Q - \min\{\lambda, 1-\lambda\}\min\{K, Q\}.$$

It is easy to see that $(K, 0)$-convexity is exactly $K$-convexity and $(K, K)$-convexity
is symmetric $K$-convexity. Prove the following.

(a) A $(K, Q)$-convex function is also $(K', Q')$-convex for $K \le K'$ and $Q \le Q'$. A
real-valued convex function is $(0,0)$-convex and hence $(K, Q)$-convex for all
$K, Q \ge 0$.

(b) If $g_1(y)$ and $g_2(y)$ are $(K_1, Q_1)$-convex and $(K_2, Q_2)$-convex respectively,
and $(K_1 - Q_1)(K_2 - Q_2) \ge 0$, then for $\alpha, \beta \ge 0$, $\alpha g_1(y) + \beta g_2(y)$ is
$(\alpha K_1 + \beta K_2, \alpha Q_1 + \beta Q_2)$-convex.

(c) If $g(y)$ is $(K, Q)$-convex and $w$ is a random variable, then $E\{g(y - w)\}$ is
also $(K, Q)$-convex, provided $E\{|g(y - w)|\} < \infty$ for all $y$.

(d) Assume that $g$ is a continuous $(K, Q)$-convex function with $K \ge Q$ and
$g(y) \to \infty$ as $|y| \to \infty$. Define

$$S = \min\{x \mid g(x) \le g(y), \text{ for any } y\},$$

$$s = \min\{x \mid g(x) = g(S) + K\},$$

$$s' = \sup\{x \mid x \le S,\ g(x') \ge g(S) + (K - Q) \text{ for any } x' \le x\},$$

and

$$u = \inf\{x \mid x \ge S,\ g(x') \ge g(S) + Q \text{ for all } x' \ge x\}.$$

Then $s \le s' \le S \le u$ and we have the following results.

(i) $g(s) = g(S) + K$ and $g(y) \ge g(s)$ for all $y \le s$.

(ii) $g(u) = g(S) + Q$ and $g(y) \ge g(u)$ for all $y \ge u$.

(iii) $g(y) \le g(z) + Q$ for all $y, z$ with $z \le y \le s'$.

(iv) $g(y) \le g(z) + K$ for all $y, z$ with $s' \le y \le z$.
(v) $g(y) \le g(z) + K$ for all $y, z$ with $(s + S)/2 \le y \le z$.

Exercise 9.5. (Chen and Simchi-Levi (2003)) If a function $f$ is $(K, Q)$-convex,
prove that the function

$$g(x) = \min_{y \le x} Q\delta(x - y) + f(y)$$

is also $(K, Q)$-convex, where $\delta(x) = 1$ for $x > 0$ and $\delta(x) = 0$ otherwise. Similarly,

$$h(x) = \min_{y \ge x} K\delta(y - x) + f(y)$$

is also $(K, Q)$-convex.
Exercise 9.6. (Chen and Simchi-Levi (2003)) Assume that $f : \mathbb{R} \to \mathbb{R}$ is
$(K, Q)$-convex. Prove that there exists a convex function $\tilde{f}(x)$ such that

$$\tilde{f}(x) \le f(x) \le \tilde{f}(x) + \max\{K, Q\}, \text{ for any } x.$$

10

Procurement Contracts 



10.1 Introduction 

The inventory models discussed in Chapter 8 focus on characterizing the optimal 
replenishment policy for a single facility given some assumptions, e.g., lead time, 
yield, of its supplier. This, of course, emphasizes the need, in many cases, to de- 
velop direct relationships with suppliers. These relationships can take many forms, 
both formal and informal, but often, to ensure adequate supplies and timely de- 
liveries, buyers and suppliers typically agree on supply contracts. These contracts 
address issues that arise between a buyer and a supplier, whether the buyer is a 
manufacturer purchasing raw materials from a supplier or a retailer purchasing 
manufactured goods from a manufacturer. In a supply contract, the buyer and 
supplier may agree on 

• Pricing and volume discounts. 

• Minimum and maximum purchase quantities. 

• Delivery lead times. 

• Product or material quality. 

• Product return policies. 

As we will see, supply contracts are very powerful tools that can be used for far 
more than to ensure adequate supply and demand for goods. 

To illustrate the importance and impact of different types of supply contracts on 
supply chain performance, consider a typical two-stage supply chain consisting of
a retailer and a supplier. In such a supply chain, the retailer places orders trying
to maximize its own profit and the supplier reacts to the orders placed by the 
retailer. This process is referred to as a sequential supply chain since decisions are 
made sequentially. Thus, in a sequential supply chain each party determines its 
own course of action independent of the impact of its decisions on other parties; 
clearly, this cannot be an effective strategy for supply chain partners. 

It is natural to look for mechanisms that enable supply chain entities to move 
beyond this sequential process and toward global optimization. Of course, this 
may be quite difficult since in a typical supply chain different parties may have
different, sometimes even conflicting, objectives. Thus, it is important to identify 
mechanisms that maximize the efficiency of the supply chain while allowing dif- 
ferent parties to focus on their own objectives. One way to achieve this goal is to 
use contracts specifying the transactions between supply chain parties such that 
every party’s objective is aligned with the objective of the entire supply chain. We 
will refer to such a contract as a contract that coordinates the supply chain.

To illustrate how supply contracts can be used to coordinate the supply chain, we 
investigate in this chapter a simplified supply chain consisting of two risk-neutral 
decision makers, a supplier and a retailer. The retailer faces uncertain demands 
and needs to procure a certain quantity of a single product from the supplier. 
The supplier then produces and delivers the order to the retailer before demand 
is realized. The two parties negotiate and form a contract regarding the terms of 
the transactions. 

A simple example of such a contract is the wholesale contract that we have seen 
in the analysis of the newsvendor problem, see Chapter 8, Section 8.2, in which 
the supplier specifies a wholesale price, while the retailer places an order to the 
supplier and the payment is proportional to the quantity purchased by the retailer. 
Unfortunately, as we will see in the next section, this simple wholesale contract 
does not coordinate the supply chain in general. 

Several supply contracts have been proposed to achieve system efficiencies. 
Among those contracts, the buy back contracts and the revenue sharing contracts 
are commonly used in some industries due to their effectiveness and simplicity. In 
fact, under the setting to be specified later on in this chapter, the two contracts 
coordinate the supply chain, that is, these contracts allow supply chain partners 
to achieve global optimization, i.e., maximize supply chain expected profit. 

Furthermore, in these contracts, the retailer’s optimal strategy, namely the op- 
timal ordering quantity, together with the supplier’s optimal strategy, namely the 
optimal cost parameters specified in the contracts, constitutes a Nash equilibrium.
Thus, neither the retailer nor the supplier could increase their profit by unilaterally 
deviating from their optimal strategies. 

Interestingly, the buy back contracts and the revenue sharing contracts are 
shown to be equivalent under our model setting. The literature on supply contracts 
that coordinate the supply chain system is quite extensive and is still expanding. 
We refer the reader to the review paper by Cachon (2002) for more details. 

Of course, effective supply contracts are not only important in the retail
industry. In the electronics industry there has been a marked increase in purchasing
volume as a percentage of the firm’s total sales. For instance, between 1998 and
2000, outsourcing in the electronics industry increased from 15 percent of all
components to 40 percent. This increase in the level of outsourcing implies that the
procurement function becomes critical for an OEM (Original Equipment
Manufacturer) to remain in control of its destiny. As a result, many OEMs focus on closely
collaborating with the suppliers of their strategic components. In some cases, this 
is done using effective supply contracts that try to coordinate the supply chain. 

A different approach has been applied by OEMs for non-strategic components. 
In this case, products can be purchased from a variety of suppliers and flexibility 
to market conditions is perceived as more important than a permanent relation- 
ship with the suppliers. Indeed, commodity products, e.g., electricity, computer 
memory, steel, oil, grain or cotton, are typically available from a large number of 
suppliers and can be purchased in spot markets. Because these are highly stan- 
dard products, switching from one supplier to another is not considered a major 
problem. 

Thus, in this chapter we also introduce and analyze portfolio contracts based on 
the recent work of Martínez-de-Albéniz and Simchi-Levi (2002). In these contracts,
the buyer signs a portfolio of supply contracts, contracts that provide the buyer 
with the appropriate trade-off between price and flexibility. 



10.2 Wholesale Contracts 

In a wholesale contract, the supplier specifies a wholesale price and in return, 
the retailer decides how much to order from the supplier. Specifically, when the 
retailer places an order, its payment to the supplier is proportional to the quantity 
it orders. Thus, in this case, the retailer is facing a newsvendor problem and chooses 
the optimal ordering quantity according to the newsvendor model we analyzed in 
Chapter 8 Section 8.2. Of course, the supplier anticipates the reaction of the
retailer and takes it into account when deciding its wholesale price. This is a
so-called Stackelberg game between the supplier and the retailer, in which the
supplier is the leader and the retailer is the follower. 

The setting in this model is as follows. The retailer places an order with the
supplier before the realization of the uncertain demand, and sells the product to 
its customers at a unit price r. Let F be the cumulative distribution function of 
the demand. The function F is assumed to be strictly increasing and differentiable. 
For simplicity, we assume that unsatisfied demand is lost and there is no penalty 
cost for lost sales. In addition, leftover inventory is salvaged with unit price v. 
Finally, we assume that the supplier has no production capacity limit and its unit 
production cost is c with v < c < r. 

Before proceeding to analyze the Stackelberg game between the supplier and 
the retailer, we first discuss the optimal production quantity of the entire system 
assuming that the supplier and the retailer belong to a centralized system. In this 
case, the objective is to maximize the system expected profit. Given the production
quantity $q$, the profit for the total supply chain is

$$\pi^o(q) = -cq + rE_D[\min(q,D)] + vE_D[\max(q - D, 0)]$$

$$= (r - c)q - (r - v)E_D[\max(q - D, 0)].$$

This is exactly the classical newsvendor problem analyzed in Chapter 8 Section
8.2. Thus, the optimal production quantity for the total supply chain is

$$q^o = F^{-1}\Bigl(\frac{r - c}{r - v}\Bigr),$$

where $F^{-1}$ is the inverse function of the cumulative distribution function $F$.
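
As a quick illustration of the critical-ratio formula, the sketch below assumes exponentially distributed demand, for which the inverse distribution function and the expected leftover inventory have closed forms. The distribution, the parameter values and the function names are assumptions made for this example only.

```python
import math

# Hypothetical data (illustrative only): exponential demand and v < c < r.
MEAN_DEMAND = 100.0
R, C, V = 10.0, 4.0, 1.0  # selling price r, production cost c, salvage value v


def inverse_cdf(u):
    """F^{-1}(u) for exponential demand, where F(q) = 1 - exp(-q / mean)."""
    return -MEAN_DEMAND * math.log(1.0 - u)


def expected_leftover(q):
    """E[max(q - D, 0)] = q - E[min(q, D)] for exponential demand."""
    return q - MEAN_DEMAND * (1.0 - math.exp(-q / MEAN_DEMAND))


def system_profit(q):
    """Expected supply chain profit (r - c) q - (r - v) E[max(q - D, 0)]."""
    return (R - C) * q - (R - V) * expected_leftover(q)


def centralized_quantity():
    """Critical-ratio solution: F^{-1}((r - c) / (r - v))."""
    return inverse_cdf((R - C) / (R - V))


if __name__ == "__main__":
    q_o = centralized_quantity()
    print("optimal quantity:", q_o)
    print("expected profit:", system_profit(q_o))
    print("profit one unit below/above:", system_profit(q_o - 1), system_profit(q_o + 1))
```

Perturbing the quantity in either direction lowers the expected profit, which is consistent with the concavity of the newsvendor objective.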

We now analyze the Stackelberg game between the supplier and the retailer. 
Assume for now that the unit wholesale price of the supplier is w. As we already 
noticed, the retailer is facing a newsvendor model. Again from the analysis of the 
newsvendor model, the optimal ordering quantity for the retailer can be determined 
as follows 

$$q(w) = F^{-1}\Bigl(\frac{r - w}{r - v}\Bigr). \qquad (10.1)$$

Of course, here we assume $r > w > v$ to avoid trivial cases. Notice that $q(w) \ge q^o$
only if $w \le c$. However, this implies that the supplier makes non-positive profit.
Thus, the supplier prefers a higher wholesale price and in this case, the retailer
always tends to order less than $q^o$, the quantity that is optimal for the entire
supply chain. We refer to this behavior as double marginalization. Of course,
this behavior has an intuitive explanation. Since the retailer bears all the risk for 
overstocking it has no incentive to order more and thus tries to reduce its risk 
exposure by reducing inventory levels. 

As we already pointed out, the supplier anticipates this behavior of the retailer 
when setting its wholesale price. From (10.1), there is a one to one correspondence 
between the optimal ordering quantity of the retailer and the wholesale price set 
by the supplier, since F is strictly increasing. Therefore, given the optimal ordering 
quantity of the retailer q, the wholesale price is 

$$w(q) = r - (r - v)F(q).$$

The objective of the supplier is to maximize its own profit, which can be written
as a function of the ordering quantity of the retailer:

$$\pi_s(q) = (w(q) - c)q = ((r - c) - (r - v)F(q))q.$$

Of course, if the cumulative distribution function F is too general, there is no 
guarantee that the supplier has a unique optimal wholesale price. Hence, we
focus on demands with increasing generalized failure rate (IGFR) distributions, i.e.,
distributions such that $qF'(q)/(1 - F(q))$ is increasing. Notice that several
commonly used distributions, such as the normal distribution and the exponential
distribution, are IGFR distributions.

We now show that for IGFR demand distributions, the optimal wholesale price
of the supplier is unique. First observe that the first order optimality condition
implies that the retailer’s ordering quantity, associated with the supplier’s optimal
wholesale price, satisfies

$$\pi_s'(q) = -(c - v) + (r - v)(1 - F(q) - F'(q)q) = 0,$$

or

$$1 - \frac{qF'(q)}{1 - F(q)} = \frac{c - v}{r - v} \cdot \frac{1}{1 - F(q)}. \qquad (10.2)$$

Notice that the left-hand side of the above equation is decreasing in $q$ while the
right-hand side of the equation is increasing in $q$. Hence, there is a unique solution
$q^*$ for equation (10.2) and therefore $w(q^*)$ is the unique optimal wholesale price
of the supplier. Furthermore, it is easy to verify that $\pi_s'(q) > 0$ for $q < q^*$ and
$\pi_s'(q) < 0$ for $q > q^*$. In other words, $\pi_s(q)$ is increasing for $q < q^*$ and decreasing
for $q > q^*$. Thus, $\pi_s(q)$ is unimodal.

In summary, for the wholesale contract, there exists a unique Nash equilibrium 
for the Stackelberg game between the supplier and the retailer when the demand 
distribution is IGFR. In addition, in such a contract, the retailer always orders 
less than the quantity which would be optimal for the entire supply chain due to 
the fact that it bears all the risks of overstocking. Thus, the wholesale contract 
does not coordinate the supply chain. 
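
The following sketch illustrates the wholesale price game numerically. It assumes exponentially distributed demand (which is IGFR) with hypothetical parameter values, and it locates the supplier's preferred quantity by bisection on the first order condition; the bracket used for the bisection relies on that specific distribution and is an assumption of this example.

```python
import math

# Hypothetical data (illustrative only): exponential demand and v < c < r.
MEAN_DEMAND = 100.0
R, C, V = 10.0, 4.0, 1.0


def cdf(q):
    return 1.0 - math.exp(-q / MEAN_DEMAND)


def pdf(q):
    return math.exp(-q / MEAN_DEMAND) / MEAN_DEMAND


def retailer_quantity(w):
    """Retailer's newsvendor response, F^{-1}((r - w) / (r - v)), for v < w <= r."""
    return -MEAN_DEMAND * math.log(1.0 - (R - w) / (R - V))


def wholesale_price(q):
    """Wholesale price that induces the retailer to order q."""
    return R - (R - V) * cdf(q)


def supplier_profit(q):
    return (wholesale_price(q) - C) * q


def supplier_marginal_profit(q):
    """Derivative of the supplier's profit with respect to the induced quantity."""
    return -(C - V) + (R - V) * (1.0 - cdf(q) - pdf(q) * q)


def supplier_optimal_quantity(lo=0.0, hi=MEAN_DEMAND, tol=1e-10):
    """Bisection on the first order condition.

    For exponential demand the marginal profit is positive at 0 and negative
    at the mean, so [0, mean] brackets the unique root.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if supplier_marginal_profit(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    q_star = supplier_optimal_quantity()
    print("equilibrium order quantity:", q_star)
    print("equilibrium wholesale price:", wholesale_price(q_star))
    print("centralized quantity:", retailer_quantity(C))
```

In this instance the equilibrium order quantity falls well short of the centralized quantity, which is the double marginalization effect described above.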



10.3 Buy Back Contracts 

The previous discussion reveals that wholesale contracts do not coordinate the 
supply chain, since the retailer bears all the risks of overstocking and tends to 
order less than the amount that would be optimal for the entire system. Thus, 
one might expect that the retailer is willing to order more and hence improve the 
performance of the supply chain if the supplier would share some of its risks. 

Buy back contracts provide such a mechanism for the supplier to share the risks
with the retailer. In such a contract, the supplier specifies a wholesale price $w_b$ and
a buy back price $b$. This contract is similar to the wholesale price contract, i.e.,
the retailer orders from the supplier according to a wholesale price $w_b$. However,
one significant difference is that in addition to a unit salvage value $v$ for unsold
items, the retailer receives a refund of $b$ per unsold unit from the supplier.

Given a wholesale price $w_b$, a buy back price $b$, and an order quantity $q$, the
retailer’s expected profit is

$$\pi_r^b(w_b, b, q) = -w_b q + rE_D[\min(q, D)] + (b + v)E_D[\max(q - D, 0)]$$

$$= (r - w_b)q - (r - b - v)E_D[\max(q - D, 0)].$$

Consider now a wholesale price $w_b$ and a buy back price $b$ satisfying the following
requirements

$$r - w_b = \lambda(r - c) \quad \text{and} \quad r - b - v = \lambda(r - v)$$

for some $\lambda \in [0, 1]$, or alternatively,

$$w_b = r - \lambda(r - c) \quad \text{and} \quad b = (1 - \lambda)(r - v).$$

This implies that the expected profit of the retailer is given by

$$\pi_r^b(w_b, b, q) = \lambda\pi^o(q).$$

Hence, the optimal order quantity of the retailer equals $q^o$, the optimal production
quantity of the entire supply chain. Similarly, the expected profit of the supplier
is given by

$$\pi_s^b(w_b, b, q) = (1 - \lambda)\pi^o(q).$$

Thus, the supplier’s optimal production quantity also equals $q^o$. Therefore, the
system’s expected profit is maximized and the buy back contract coordinates the
supply chain. Furthermore, in this case, the retailer receives a fraction $\lambda$ of the
system’s expected profit and the supplier seizes a fraction $(1 - \lambda)$ of it.
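
The profit split under a coordinating buy back contract can be verified directly. The sketch below assumes a hypothetical three-point demand distribution and illustrative cost parameters; it computes the retailer's and the supplier's expected profits for the contract terms given above and compares them with the corresponding fractions of the system profit.

```python
# Hypothetical data (illustrative only): discrete demand and v < c < r.
DEMAND = [(50, 0.2), (100, 0.5), (150, 0.3)]  # (demand value, probability)
R, C, V = 10.0, 4.0, 1.0


def expected_leftover(q):
    """E[max(q - D, 0)] for the discrete demand above."""
    return sum(p * max(q - d, 0) for d, p in DEMAND)


def system_profit(q):
    return (R - C) * q - (R - V) * expected_leftover(q)


def buy_back_terms(lam):
    """Coordinating terms: w_b = r - lam (r - c), b = (1 - lam)(r - v)."""
    return R - lam * (R - C), (1.0 - lam) * (R - V)


def retailer_profit(w_b, b, q):
    """Retailer pays w_b per unit and recovers b + v per unsold unit."""
    return (R - w_b) * q - (R - b - V) * expected_leftover(q)


def supplier_profit(w_b, b, q):
    """Supplier earns w_b - c per unit and refunds b per unsold unit."""
    return (w_b - C) * q - b * expected_leftover(q)


if __name__ == "__main__":
    lam = 0.3
    w_b, b = buy_back_terms(lam)
    for q in (60, 100, 140):
        print(q, retailer_profit(w_b, b, q), supplier_profit(w_b, b, q), system_profit(q))
```

Because both profits are fixed multiples of the system profit for every order quantity, each party's preferred quantity coincides with the system optimum regardless of the value of $\lambda$ strictly between 0 and 1.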



10.4 Revenue Sharing Contracts 

A different contract that allows for risk sharing between suppliers and retailers is 
the so-called revenue sharing contract. In a revenue sharing contract, the retailer 
and the supplier agree on the wholesale price, typically discounted wholesale price, 
and in return the supplier receives a given fraction of the revenue from each unit 
sold by the retailer. Of course, since the supplier receives some of the revenue, 
it has an incentive to reduce the wholesale price and hence increase the amount 
ordered by the retailer. 

Assume that the wholesale price is $w_r$ and the supplier receives a fraction $(1 - \phi)$
of the retailer’s revenue. Thus, the retailer’s profit is

$$\pi_r^r(w_r, \phi, q) = -w_r q + \phi\bigl(rE_D[\min(q, D)] + vE_D[\max(q - D, 0)]\bigr)$$

$$= (\phi r - w_r)q - \phi(r - v)E_D[\max(q - D, 0)].$$

If we choose $\phi$ and $w_r$ such that

$$\phi r - w_r = \lambda(r - c),$$

and

$$\phi = \lambda$$

for some $\lambda \in [0, 1]$, then

$$\pi_r^r(w_r, \phi, q) = \lambda\pi^o(q).$$

Similarly, the supplier’s expected profit is given by

$$\pi_s^r(w_r, \phi, q) = (1 - \lambda)\pi^o(q).$$

Thus, both the retailer’s optimal ordering quantity and the supplier’s optimal
production quantity equal $q^o$, the optimal production quantity for the entire supply
chain. Hence, the system’s expected profit is maximized and the revenue sharing
contract coordinates the supply chain.

Furthermore, if the wholesale price is $w_r(\lambda) = \lambda c$ and the retailer keeps a
fraction $\phi(\lambda) = \lambda$ of its revenue, sharing the remaining fraction $1 - \lambda$ with the
supplier, the retailer receives a fraction $\lambda$ of the system’s expected profit and
the supplier seizes a fraction $(1 - \lambda)$ of the system’s expected profit.

Notice that both the buy back contract with parameters $(w_b(\lambda), b(\lambda))$ and the
revenue sharing contract with parameters $(w_r(\lambda), \phi(\lambda))$ coordinate the supply
chain and have the same allocation of the system’s expected profit to the
supplier and the retailer.

In fact, a revenue sharing contract with parameters $(w_r(\lambda), \phi(\lambda))$ is equivalent
to the following contract: the wholesale price is $w_r(\lambda) + (1 - \lambda)r$, and the retailer
receives $r$ for each sold unit and gets a refund equal to $(1 - \lambda)r - (1 - \phi(\lambda))v$ from
the supplier for each salvaged unit. It is easy to verify that this is exactly the buy
back contract with parameters $(w_b(\lambda), b(\lambda))$.
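
The equivalence claimed above is a matter of simple algebra, and it can also be confirmed numerically. The sketch below uses hypothetical cost parameters and a hypothetical three-point demand distribution; it maps the revenue sharing terms to the equivalent wholesale price and refund and compares them with the coordinating buy back terms $w_b(\lambda) = r - \lambda(r - c)$ and $b(\lambda) = (1 - \lambda)(r - v)$.

```python
# Hypothetical data (illustrative only): discrete demand and v < c < r.
DEMAND = [(50, 0.2), (100, 0.5), (150, 0.3)]  # (demand value, probability)
R, C, V = 10.0, 4.0, 1.0


def expected_leftover(q):
    return sum(p * max(q - d, 0) for d, p in DEMAND)


def system_profit(q):
    return (R - C) * q - (R - V) * expected_leftover(q)


def revenue_sharing_terms(lam):
    """Coordinating terms: wholesale price lam * c, retailer keeps fraction lam."""
    return lam * C, lam


def retailer_profit_revenue_sharing(w_r, phi, q):
    return (phi * R - w_r) * q - phi * (R - V) * expected_leftover(q)


def as_buy_back(w_r, phi, lam):
    """Wholesale price and per-unit refund of the equivalent contract in the text."""
    wholesale = w_r + (1.0 - lam) * R
    refund = (1.0 - lam) * R - (1.0 - phi) * V
    return wholesale, refund


def buy_back_terms(lam):
    return R - lam * (R - C), (1.0 - lam) * (R - V)


if __name__ == "__main__":
    lam = 0.4
    w_r, phi = revenue_sharing_terms(lam)
    print("revenue sharing as buy back:", as_buy_back(w_r, phi, lam))
    print("coordinating buy back terms:", buy_back_terms(lam))
    print("retailer share at q = 100:",
          retailer_profit_revenue_sharing(w_r, phi, 100) / system_profit(100))
```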

The following example illustrates the impact of supply contracts in practice. 

Until 1998, video rental stores used to purchase copies of newly released movies
from the movie studios for about \$65 and rent them to customers for \$3. Because of the
high purchase price, rental stores did not buy enough copies to cover peak demand,
which typically occurs during the first 10 weeks after a movie is released on video.
The result was low customer service level; in a 1998 survey, about 20 percent of
customers could not get their first choice of movie. Then, in 1998, Blockbuster
Video entered into a revenue-sharing contract with the movie studios in which
the wholesale price was reduced from \$65 to \$8 per copy, and, in return, studios were
paid about 30-45 percent of the rental price of every rental. This revenue-sharing
contract had a huge impact on Blockbuster revenue and market share. Today,
revenue sharing is used by most large video rental stores, see Cachon and Lariviere
(2000).

10.5 Portfolio Contracts 

A recent trend for many industrial manufacturers has been outsourcing; firms 
are considering outsourcing everything from production and manufacturing to the 
procurement function itself. Indeed, in the mid 90s, there was a significant increase 
in purchasing volume as a percentage of the firm’s total sales. More recently, 
between 1998 and 2000, outsourcing in the electronic industry has increased from 
15 percent of all components to 40 percent. 

Of course, the increase in the level of outsourcing implies that the procurement 
function becomes critical for a manufacturer to remain in control of its destiny. 
Thus, an effective procurement strategy has to focus on both driving costs down 
and reducing risks. These risks include both inventory and financial risks. By
inventory risks we refer to inventory shortages, while financial risks refer to the
purchasing price which is uncertain if the procurement strategy depends on spot 
markets. 

A traditional procurement strategy that eliminates financial risk is the use of 
fixed commitment contracts. These contracts specify a fixed amount of supply to 
be delivered at some point in the future; the supplier and the manufacturer agree 
on both the price and the quantity delivered to the manufacturer. Thus, in this 
case, the manufacturer bears no financial risk while taking huge inventory risks 
due to uncertainty in demand and the inability to adjust order quantities. 

One way to reduce inventory risk is through option contracts, in which the buyer 
pre-pays a relatively small fraction of the product price up-front, in return for a 
commitment from the supplier to reserve capacity up to a certain level. The initial 
payment is typically referred to as reservation price or premium. If the buyer does 
not exercise the option, the initial payment is lost. The buyer can purchase any 
amount of supply up to the option level, by paying an additional price, agreed 
to at the time the contract is signed, for each unit purchased. This additional 
price is referred to as execution price or exercise price. Of course, the total price 
(reservation plus execution price) paid by the manufacturer for each purchased 
unit is typically higher than the unit price in a fixed commitment contract. 

Evidently, option contracts provide the manufacturer with flexibility to adjust 
order quantities depending on realized demand and hence these contracts reduce 
inventory risk. Thus, these contracts shift risks from the manufacturer to the 
supplier since the supplier is now exposed to customer demand uncertainty. This 
is in contrast to fixed commitment contracts in which the manufacturer takes all 
the risk. 

Thus, consider a single period model in which the manufacturer can procure 
a single product from multiple sources. For example, consider automotive man- 
ufacturing companies purchasing steel or PC manufacturers procuring memory 
units. 

The manufacturer faces stochastic demand $D$, and sells the finished product at
a unit selling price $r$. Unsold items have a unit salvage value $v$. Most importantly,
we assume that there are a total of $n$ suppliers and before the planning horizon,
the manufacturer signs an option contract with each supplier. That is, the manufacturer
reserves capacity $x_i$ with the $i$th supplier for a reservation cost $v_i$ per unit of
capacity reserved, and pays an execution fee of $w_i$ for each unit ordered from the
supplier, after demand is realized. Thus, the procurement strategy of the manufacturer
is a portfolio contract consisting of $n$ option contracts with parameters $(v_i, w_i, x_i)$.

The class of portfolio contracts contains several widely used contracts. This in- 
cludes, for instance, long term contracts, buy back and flexibility contracts. A long 
term contract specifies a fixed amount of supply, x, to be delivered at a predeter- 
mined time in the future for a given price, v. Thus, it is equivalent to a portfolio 
contract consisting of only one option contract with parameters (v, 0, x), i.e., with 
positive reservation price and zero execution cost. In the long term contract, the 
buyer bears all the risks of overstocking or understocking due to uncertain demand 
and its inability to adjust order quantity. 
A flexibility contract specifies a fixed amount of supply, $x$, for a given price, $v$.
In addition, in this contract, the amount to be delivered and paid can differ from
the specified quantity by no more than a given percentage, say $a$, determined upon
signing the contract. That is, the order quantity is within the interval
$[(1-a)x, (1+a)x]$. The flexibility contract is equivalent to a portfolio contract
consisting of a long term contract with parameters $(v, 0, (1-a)x)$ and an option
contract with parameters $(0, v, 2ax)$.
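
As a quick check of this equivalence (a worked step added for illustration, using only the parameters above), take any order quantity $q \in [(1-a)x, (1+a)x]$. Under the portfolio, the long term part costs $v(1-a)x$ in reservation fees with no execution fee, and the remaining $q - (1-a)x \le 2ax$ units are bought through the option at execution fee $v$ with no reservation fee:

$$v(1-a)x + v\bigl(q - (1-a)x\bigr) = vq,$$

which is exactly what the flexibility contract charges for $q$ units at price $v$.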

Given that the retailer has to procure $q$ units from the suppliers, it can choose
an appropriate combination of suppliers so that its cost is minimized. Let $R(q)$ be
the optimal cost for procuring $q$ units from the $n$ suppliers. We have

$$R(q) = \sum_{i=1}^{n} v_i x_i + \min \sum_{i=1}^{n} w_i q_i$$

subject to

$$\sum_{i=1}^{n} q_i = q, \qquad (10.3)$$

$$0 \le q_i \le x_i, \quad \text{for all } i = 1, 2, \dots, n.$$



It is easy to prove that R(q) is a convex piecewise linear function of q. 
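
The inner minimization in (10.3) can be solved greedily: fill the order from the cheapest execution fee upward. The following is a minimal sketch under that observation; the function name `procurement_cost` and the sample contracts are illustrative assumptions, not taken from the text.

```python
def procurement_cost(q, contracts):
    """Evaluate R(q) from (10.3) for contracts given as (v_i, w_i, x_i) tuples."""
    total_capacity = sum(x_i for _, _, x_i in contracts)
    if q < 0 or q > total_capacity:
        raise ValueError("q must lie between 0 and the total reserved capacity")
    # Reservation fees are paid on all reserved capacity, whatever q turns out to be.
    cost = sum(v_i * x_i for v_i, _, x_i in contracts)
    remaining = q
    # Execute the cheapest options first; this solves the inner minimization.
    for _, w_i, x_i in sorted(contracts, key=lambda c: c[1]):
        used = min(remaining, x_i)
        cost += w_i * used
        remaining -= used
    return cost


# Hypothetical portfolio: (reservation fee, execution fee, capacity).
contracts = [(2.0, 1.0, 10.0), (1.0, 3.0, 10.0)]
costs = [procurement_cost(q, contracts) for q in (0, 5, 10, 15, 20)]
```

In this example the slope of `costs` is 1 up to q = 10 and 3 afterwards, which is the convex piecewise linear shape claimed above.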

Given an initial inventory level $I$ and the order quantity $q$, the buyer’s expected
profit is

$$f(I, q) = G(I + q) - R(q),$$

where

$$G(q) = r E[\min(q, D)] + v E[\max(0, q - D)].$$

In the following, we characterize the optimal replenishment policy for the
retailer. First, we present a result which illustrates how the optimal order
quantity changes monotonically as a function of the initial inventory level when the
retailer’s ordering cost function is convex while its revenue function is concave.



Theorem 10.5.1 Assume that the ordering cost function $R$ is convex and the
revenue function $G$ is concave. Moreover, $f(I, q) \to -\infty$ for $q \to \infty$ for any $I$.
Then, there exists a function $q^*(I)$ solving

$$\max_{q \ge 0} f(I, q) \qquad (10.4)$$

such that $q^*(I)$ is non-increasing and $I + q^*(I)$ is nondecreasing.



Proof. First observe that $q$ is an optimal solution for the optimization problem
(10.4) if and only if $q' = -q$ is optimal for the following problem

$$\max_{q' \le 0} g(I, q') := G(I - q') - R(-q'). \qquad (10.5)$$

Let $q'(I) = \min\{ q' \le 0 \mid q' \text{ solves } (10.5)\}$. Since $G$ is concave, Theorem 2.3.6
implies that $g(I, q')$ is supermodular. Therefore, from Theorem 2.3.7, we have that
$q'(I)$ is nondecreasing. Thus $q^*(I) = -q'(I)$ solves (10.4) and is non-increasing.

To prove the remaining part of the theorem, observe that $q$ is an optimal solution
for the optimization problem (10.4) if and only if $I' = I + q$ is optimal for the
following problem

$$\max_{I' \ge I} g(I, I') := G(I') - R(I' - I). \qquad (10.6)$$

Since $R$ is convex, Theorem 2.3.6 implies that $g(I, I')$ is supermodular. Therefore,
from Theorem 2.3.7 and the definition of $q^*(I)$, we have that $I'(I) = I + q^*(I)$
is nondecreasing. ∎

The above result implies that the optimal ordering quantity is a non-increasing
function of the initial inventory level, while the end period inventory level,
$I + q^*(I) - E[D]$, is a nondecreasing function of the initial inventory level.
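
The monotonicity in Theorem 10.5.1 can be observed numerically. The sketch below is an illustration under stated assumptions: demand is a small set of equally likely scenarios, order quantities are restricted to integers up to the total reserved capacity, and all names and parameter values are hypothetical. Exact fractions are used so that ties are not disturbed by rounding.

```python
from fractions import Fraction


def expected_revenue(y, r, v, scenarios):
    """G(y) = r*E[min(y, D)] + v*E[max(0, y - D)] for equally likely scenarios."""
    weight = Fraction(1, len(scenarios))
    return sum(weight * (r * min(y, d) + v * max(0, y - d)) for d in scenarios)


def procurement_cost(q, contracts):
    """R(q): reservation fees plus execution fees, cheapest option first."""
    cost = sum(v_i * x_i for v_i, _, x_i in contracts)
    remaining = q
    for _, w_i, x_i in sorted(contracts, key=lambda c: c[1]):
        used = min(remaining, x_i)
        cost += w_i * used
        remaining -= used
    return cost


def best_order(I, r, v, scenarios, contracts):
    """Smallest integer q maximizing f(I, q) = G(I + q) - R(q)."""
    capacity = sum(x_i for _, _, x_i in contracts)
    # max() returns the first maximizer it meets, i.e. the smallest optimal q.
    return max(
        range(capacity + 1),
        key=lambda q: expected_revenue(I + q, r, v, scenarios)
        - procurement_cost(q, contracts),
    )


# Hypothetical data: selling price, salvage value, demand scenarios, contracts.
r, v = 10, 1
scenarios = [5, 15, 25, 35]
contracts = [(1, 2, 10), (1, 6, 10)]  # (v_i, w_i, x_i)
policy = [best_order(I, r, v, scenarios, contracts) for I in range(41)]
```

In this instance the order quantity falls from 15 at I = 0 to 0 for large I, while the order-up-to level I + q never decreases, in line with the theorem.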

Finally, we characterize the structure of the optimal ordering policy of the
retailer when a portfolio contract is employed by the retailer. As we already pointed
out, the order cost function $R(q)$ is convex piecewise linear. In fact, without loss
of generality, assume that $w_1 \le w_2 \le \dots \le w_n$. Define $z_0 = 0$ and
$z_i = \sum_{j=1}^{i} x_j$ for $i = 1, 2, \dots, n$. Then,

$$R(q) = \sum_{j=1}^{n} v_j x_j + \sum_{j=1}^{i-1} w_j x_j + w_i (q - z_{i-1}) \quad \text{for } q \in [z_{i-1}, z_i].$$

Hence for $q \in (z_{i-1}, z_i)$, $R'(q) = w_i$, and for $q = z_i$, $\partial R(q) = [w_i, w_{i+1}]$.

Theorem 10.5.2 If the ordering cost function $R(q)$ is given by (10.3), then there
exist inventory levels $I_i$ ($i = 1, 2, \dots, 2n + 1$) with

$$\infty = I_0 \ge I_1 \ge \dots \ge I_{2n} \ge I_{2n+1} = 0$$

such that

(a) for $I \in [I_{2i}, I_{2i-1})$, it is optimal to set $I' = I + q$ to a constant level such
that $w_i \in \partial G(I + q)$.

(b) for $I \in [I_{2i+1}, I_{2i})$, it is optimal to set the ordering quantity $q$ to the constant
level $z_i$.

Proof. For $i = 1, 2, \dots, n$, let

$$q_i = \max\{q^* \ge 0 \mid q^* \text{ maximizes } G(q) - w_i q \text{ subject to } q \ge 0\}.$$

Theorem 2.3.4 and Theorem 2.3.7 imply that $q_i \le q_{i-1}$ for $i = 2, 3, \dots, n$.

Let $I_0 = \infty$, $I_{2n+1} = 0$,

$$I_{2i-1} = \max(q_i - z_{i-1}, 0) \quad \text{and} \quad I_{2i} = \max(q_i - z_i, 0), \quad i = 1, 2, \dots, n.$$

Then, $\infty = I_0 \ge I_1 \ge \dots \ge I_{2n} \ge I_{2n+1} = 0$. We claim that $I_i$ ($i = 0, 1, \dots, 2n+1$)
satisfies part (a) and part (b).

First, notice that for $I \in [I_{2i}, I_{2i-1}) \ne \emptyset$, we have $q_i > 0$ and
$q = q_i - I \in (z_{i-1}, z_i]$. The first order optimality condition implies that
$w_i \in \partial G(q_i)$. Hence $q^*(I) = q_i - I$ is optimal for problem (10.4), since we have
$0 \in \partial (G(I + q) - R(q))|_{q = q^*(I)}$. Thus, part (a) is true.

On the other hand, for $I \in [I_{2i+1}, I_{2i})$ with $i \ge 1$, we claim that the optimal
ordering quantity is $q^*(I) = z_i$. In fact, observe that for $I = I_{2i}$, we have
$q^*(I) = q_i - I = z_i$ and for $I = I_{2i+1} > 0$, we have $q^*(I) = q_{i+1} - I = z_i$. Thus,
Theorem 10.5.1 implies that for $I \in [I_{2i+1}, I_{2i})$, $q^*(I) = z_i$ is optimal. Finally, for
$I \ge I_1$, it is clear that $q^*(I) = 0$ is optimal. Hence part (b) holds. ∎

Figure (10.1) illustrates the structure of the optimal ordering policy identified 
in Theorem 10.5.2 for a case with n = 2. 

[Figure omitted: two panels, "order-up-to level" and "order quantity".]

FIGURE 10.1. Illustration of the structure of the optimal ordering policy



10.6 Exercises 



Exercise 10.1. Prove that the normal distribution and the exponential distribution
have increasing generalized failure rate (IGFR).

Exercise 10.2. As we have shown in Section 10.2, wholesale contracts do not
coordinate the supply chain in general. Now assume that the supplier is willing to
provide an all-unit quantity discount. Design an all-unit quantity discount contract
coordinating the supply chain, i.e., find a per unit wholesale price $w(q)$ as a
decreasing function of the order quantity $q$ such that the optimal ordering quantity of
the retailer and the optimal production quantity of the supplier equal the optimal
production quantity of the whole system.

Exercise 10.3. Show that a buy back contract is a special case of portfolio con- 
tracts. 

Exercise 10.4. Show that Theorem 10.5.2 implies the optimality of a modified base
stock policy. In such a policy, there exist inventory target levels $b_i \ge 0$ ($i = 1, \dots, n$)
with $b_i \le \max(0, b_{i+1} - x_{i+1})$ for $i = 1, \dots, n-1$, such that it is optimal to order
nothing if $I \ge b_n$, order $b_i - I$ if $I \in [\max(0, b_i - x_i), b_i]$, or order $x_i$ otherwise.

Exercise 10.5. Consider a single manufacturer and a single supplier. Six months
before demand is realized the manufacturer has to sign a supply contract with
the supplier. Let $D$ be a random variable representing demand and let $f(D)$ be the
demand density function. Let $p$ be the selling price, i.e., the price at which the
manufacturer sells products to consumers.

The sequence of events is as follows. Procurement contracts are signed in Febru- 
ary and demand is realized during a short period of ten weeks that starts in August. 
Components are delivered from the supplier to the manufacturer at the beginning 
of August and the manufacturer produces items to customer orders. Thus, we can 
ignore any inventory holding cost. We will assume that unsold items at the end of 
the ten week selling period have zero value. Finally, assume that the manufacturer 
can also purchase additional items in the spot market. Let $s$ be a random variable
representing the per-unit spot market price and let $f(s)$ be its density function. The
objective is to identify a procurement strategy so as to maximize expected profit.

Assume the supplier offers an option contract in which the per-unit reservation 
price is v and the per-unit execution price is w. Given the existence of the spot 
market, how much capacity should the manufacturer reserve with the supplier 
when the contract is signed in February? 

11.1 Introduction

In the last decade many companies have recognized that important cost savings 
and improved service levels can be achieved by effectively integrating produc- 
tion plans, inventory control and transportation policies throughout their supply 
chains. The focus of this chapter is on planning models that integrate decisions 
across the supply chain for companies that rely on third party carriers. These mod- 
els are motivated in part by the great development and growth of many competing 
transportation modes, mainly as a consequence of deregulation of the transporta- 
tion industry. This has led to a significant decrease in transportation costs charged 
by third party distributors and, therefore, to an ever growing number of companies 
that rely on third party carriers for the transportation of their goods. 

One important mode of transportation used in the retail, grocery and electronic 
industries is the LTL (Less-than-TruckLoad) mode, which is attractive when ship- 
ment sizes are considerably less than truck capacity. Typically, LTL carriers offer 
volume, or quantity, discounts to their clients to encourage demand for larger, 
more profitable shipments. In this chapter we model these discounts as a piece- 
wise linear concave function of the quantity shipped. 

Similarly, production costs can often be approximated by piece-wise linear and
concave functions in the quantity produced, e.g., set-up plus linear manufacturing
costs. These economies of scale motivate the shipper to coordinate the production,
routing and timing of shipments over the transportation network to minimize
system-wide costs. In what follows, we refer to this problem as the Shipper Problem.

This planning model, while quite general, is based on several assumptions which 
are consistent with the view of modern logistics networks. Indeed, the model deals
with situations in which all facilities are part of the same logistics network, and 
information is available to a central decision maker whose objective is to optimize 
the entire system. Thus, distribution problems in the retail and grocery indus- 
tries are special cases of our model where the logistics network does not include 
manufacturing facilities. 

The model also applies to situations in which suppliers and retailers are engaged 
in strategic partnering. For instance, in a Vendor Managed Inventory (VMI) part- 
nership, point-of-sales data is transmitted to the supplier, which is responsible for 
the coordination of production and distribution including managing retail inven- 
tory and shipment schedules. Hence, in this case, the model includes manufacturing 
facilities, warehouses and retail outlets. 

This deterministic tactical model is motivated, in part, by our experience with 
a number of companies who apply similar models on a rolling-horizon basis. That 
is, they consider forecast demand for the next fifty two weeks and allow the model 
to generate a production, transportation and inventory schedule for the entire 
planning horizon. The use of a rolling-horizon implies that these companies employ 
the plan generated by the model only for a few time periods, say for the first three 
or four weeks. As time goes on, they update the demand forecasts and run the 
model again. 

While this model is deterministic, in practice, safety stocks are determined
exogenously and incorporated into the minimum inventory level that should be
maintained at the beginning of each period. Of course, an important question when
managing inventory in a complex supply chain is where to keep safety stock. The
answer to this question clearly depends on the desired level of service, the logistics
network, demand forecast and forecast error as well as lead times and lead time
variability. Thus, in Section 11.3 we discuss models for positioning and optimizing
safety stock in the supply chain. We start in the next section with our modeling 
approach and results for the Shipper Problem. 



11.2 The Shipper Problem 

In this section, we focus on the Shipper Problem under piece-wise linear and con- 
cave production and transportation costs, and use properties resulting from the 
concavity of the cost function to devise an efficient algorithm. 

The objective of the shipper is to find a production plan, an inventory policy 
and a routing strategy so as to minimize total cost and satisfy all the demands. 
Backlogging of demands may be allowed, incurring a known penalty cost which 
is a function of the length of the shortage period and the level of shortage. In 
this case, four different costs must be balanced to obtain an overall optimal pol- 
icy: production costs, LTL shipping charges, holding costs incurred when carrying 
inventory at some facility and penalty costs for delayed deliveries. 

To formulate this tactical problem we first incorporate the time dimension into 
the model by constructing the so-called expanded network. This expanded net-
work is used to formulate the Shipper Problem as a set-partitioning problem. The 
formulation is found to have surprising properties, which are used to develop an 
efficient algorithm and to show that the linear programming relaxation of the 
set-partitioning formulation is tight in certain special cases (Subsection 11.2.4). 
Computational results, demonstrating the performance of the algorithm on a set 
of test problems, are reported in Subsection 11.2.5. 



11.2.1 The Shipper Model 

Consider a generic transportation network, G = (N, A), with a set of nodes N 
representing the suppliers, warehouses and customers. Customer demands for the 
next T periods are assumed to be deterministic and each of them is considered 
as a separate commodity, characterized by its origin, destination, size and the 
time period when it is demanded. Our problem is to plan production and route 
shipments over time so as to satisfy these demands while minimizing the total 
production, shipping, inventory and penalty costs. 

A standard technique to efficiently incorporate the time dimension into the
model is to construct the following expanded network. Let $\tau_1, \tau_2, \dots, \tau_T$ be an
enumeration of the relevant time periods of the model. In the original network, $G$,
each node $i$ is replaced by a set of nodes $i_1, i_2, \dots, i_T$. We connect node $i_u$ with node
$j_v$ if and only if $\tau_v - \tau_u$ is exactly the time it takes to travel from $i$ to $j$. Thus, arc
$i_u \to j_v$ represents freight being carried from $i$ to $j$ starting at time $\tau_u$ and ending
at time $\tau_v$. We call such arcs shipping links. In order to account for penalties
associated with delayed shipments, a new node is created for each commodity
and serves as its ultimate sink. For a given commodity, a link between a node
representing its associated retailer at a specific time period, and its corresponding
sink node, represents the penalty cost of delivering a specific shipment in that time
period, and is called a penalty link. Similarly, to include production decisions in the
network model, we add for each node $i_t$ corresponding to a production facility
(supplier) $i$ at a particular point in time $t$, a dummy node $i'_t$ and an arc from
$i'_t$ to $i_t$ whose cost represents the piece-wise linear concave manufacturing costs.
Observe that these production links have the same cost structure as the shipping
links. Consequently, in our analysis of the network model we will include them
in the set of shipping links. Finally, we add links $(i_l, i_{l+1})$ for $l = 1, 2, \dots, T - 1$,
referred to as inventory links.
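
The construction of shipping and inventory links can be sketched as follows. This is a minimal illustration, assuming equally spaced periods numbered 1 to T so that the travel time from i to j is a whole number of periods; penalty and production links are left out, and the function and variable names are hypothetical.

```python
def build_expanded_network(facilities, travel_time, T):
    """Return (shipping_links, inventory_links) of the time-expanded network.

    facilities: iterable of facility names.
    travel_time: dict mapping an original arc (i, j) to its travel time in periods.
    T: number of periods; node (i, u) is facility i at period u.
    """
    shipping_links = []
    for (i, j), lag in travel_time.items():
        for u in range(1, T + 1):
            v = u + lag
            if v <= T:  # the shipment must arrive within the horizon
                shipping_links.append(((i, u), (j, v)))
    # Inventory links carry stock at a facility from one period to the next.
    inventory_links = [
        ((i, l), (i, l + 1)) for i in facilities for l in range(1, T)
    ]
    return shipping_links, inventory_links


# One supplier "S", one retailer "R", zero travel time, three periods.
shipping, inventory = build_expanded_network(["S", "R"], {("S", "R"): 0}, 3)
```

For this small instance the network has three shipping links, one per period, and two inventory links at each facility.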

Let $G_T = (V, E)$ be the expanded network. Figure 11.1 illustrates the expanded
network for a simple scenario where the shipping and inventory costs have
to be balanced over a time horizon of just three periods and shortages are not
allowed. For simplicity, we assume that travel times are zero.

Observe that, using the expanded network, the shipper problem can be formu- 
lated as a concave-cost multicommodity network flow problem. 

[Figure omitted: two panels, "Simple Scenario" and "Associated Expanded Network".]

FIGURE 11.1. Example of expanded network

11.2.2 A Set-Partitioning Approach

To describe our modeling approach, we introduce the following notation. Let
$\mathcal{K} = \{1, 2, \dots, K\}$ be the index set of all commodities, or different demands with fixed
origin and destination, and let $w_k$, $k = 1, 2, \dots, K$, be their corresponding size.
For instance, commodity $k = 1$ may correspond to a demand of $w_1 = 100$ units
that needs to be shipped from a certain supplier to a certain retailer and must
arrive by a particular period of time or incur delay penalties. Let the set of all
possible paths for commodity $k$ be $P_k$ and let $c_{pk}$ be the sum of inventory and
penalty costs incurred when commodity $k$ is shipped along path $p \in P_k$. Observe
that the shipping cost associated with a path will depend on the total quantity
of all commodities being sent along each of its shipping links and, consequently,
it cannot be added to the path cost a priori. Thus, each shipping edge, whose cost
must be globally computed, needs to be considered separately. Let the set of all
shipping edges be $SE$ and for each edge $e \in SE$, let $z_e$ be the total sum of weight
of the commodities traveling on that edge.

We assume that the cost of a shipping edge $e$, $e \in SE$, of the expanded network
$G_T(V, E)$, is $F_e(z_e)$, a piece-wise linear and concave cost function which is
non-decreasing in the total quantity, $z_e$, of the commodities sharing edge $e$. As
presented in Balakrishnan and Graves (1989), this special cost structure allows for
a formulation of the problem as a mixed integer linear program. For this purpose,
the piece-wise linear concave functions are modeled as follows. Let $R$ be the number
of different slopes in the cost function, which we assume, without loss of generality,
is the same for all edges to avoid cumbersome notation. Let $M_e^{r-1}$, $M_e^r$,
$r = 1, \dots, R$, denote the lower and upper limits, respectively, on the interval of quantities
corresponding to the $r$th slope of the cost function associated with edge $e$. Note
that $M_e^0 = 0$ and $M_e^R$ can be set to the total quantity of all commodities that may
use arc $e$. We associate with each of these intervals, say $r$, a variable cost per unit,
denoted by $\alpha_e^r$, equal to the slope of the corresponding line segment, and a fixed
cost, $f_e^r$, defined as the y-intercept of the linear prolongation of that segment. See
Figure 11.2 for a graphical representation. Observe that the cost incurred by any
quantity on a certain range is the sum of its associated fixed cost plus the cost of
sending all units at its corresponding linear cost. That is, we can express the arc
flow cost function, $F_e(z_e)$, as

$$F_e(z_e) = f_e^r + \alpha_e^r z_e, \quad \text{if } z_e \in (M_e^{r-1}, M_e^r].$$

FIGURE 11.2. Piece-wise linear and concave cost structure

Clearly,

Property 11.2.1 The concavity and monotonicity of the function $F_e$ imply that,

1. $\alpha_e^1 > \alpha_e^2 > \dots > \alpha_e^R \ge 0$,

2. $0 \le f_e^1 < f_e^2 < \dots < f_e^R$,

3. $F_e(z_e) = \min_{r=1,\dots,R} \{f_e^r + \alpha_e^r z_e\}$. The minimum is achieved at a unique
index $s$, unless $z_e = M_e^s$, in which case the two consecutive indexes $s$ and
$s + 1$ lead to the same minimum cost.
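
Part 3 of the property can be checked on a small example. The sketch below compares the interval-based definition of the edge cost with the minimum over all segments for positive flows; the segment data and function names are hypothetical and chosen so that consecutive segments intersect at the interval limits.

```python
def cost_by_interval(z, fixed, slopes, limits):
    """F_e(z): locate the interval (M^{r-1}, M^r] containing z, apply its segment."""
    for r in range(1, len(limits)):
        if limits[r - 1] < z <= limits[r]:
            return fixed[r - 1] + slopes[r - 1] * z
    raise ValueError("z must lie in (M^0, M^R]")


def cost_by_minimum(z, fixed, slopes):
    """F_e(z) as the lower envelope of the R affine segments (Property 11.2.1, part 3)."""
    return min(f + a * z for f, a in zip(fixed, slopes))


# Hypothetical concave cost with R = 3 segments.
fixed = [10, 30, 60]      # f^1 < f^2 < f^3
slopes = [5, 3, 1]        # alpha^1 > alpha^2 > alpha^3
limits = [0, 10, 15, 40]  # M^0, M^1, M^2, M^3

agree = all(
    cost_by_interval(z, fixed, slopes, limits) == cost_by_minimum(z, fixed, slopes)
    for z in range(1, 41)
)
```

At z = 10, the first interval limit, segments 1 and 2 both give a cost of 60, which is the tie described in part 3.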

We are now ready to introduce an integer linear programming formulation of the
Shipper Problem for this special cost structure. Recall that $z_e$ denotes the total
flow on edge $e$ and let $z_{ek}$ be the quantity of commodity $k$ that is shipped along
that edge. For all $e \in SE$ and $r = 1, \dots, R$ define the interval variables,

$$x_e^r = \begin{cases} 1, & \text{if } z_e \in (M_e^{r-1}, M_e^r], \\ 0, & \text{otherwise,} \end{cases}$$

and, in addition, for every $k$, $k \in \mathcal{K}$, let the quantity variables be

$$z_{ek}^r = \begin{cases} z_{ek}, & \text{if } z_e \in (M_e^{r-1}, M_e^r], \\ 0, & \text{otherwise.} \end{cases}$$

In order to relate these edge flows to path flows we define, for each $e \in SE$ and
$p \in \bigcup_{k=1}^{K} P_k$,

$$\delta_e^p = \begin{cases} 1, & \text{if shipping link } e \text{ is in path } p, \\ 0, & \text{otherwise.} \end{cases}$$

Finally, let variables

$$y_{pk} = \begin{cases} 1, & \text{if commodity } k \text{ follows path } p \text{ in the optimal solution,} \\ 0, & \text{otherwise,} \end{cases}$$

for each $k \in \mathcal{K}$ and $p \in P_k$. These variables are referred to as path flow variables.
Observe that defining these variables as binary variables implies that for every
commodity $k$ only one of the variables $y_{pk}$ takes a positive value. This reflects a
common business practice in which each commodity, that is, items originated at 
the same source and destined to the same sink in the expanded network, is shipped 
along a single path. These integrality constraints are, however, not restrictive, as 
pointed out in Property 11.2.2 below, since the problem is uncapacitated and the 
cost functions concave. 

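In the Set-Partitioning formulation of the Shipper Problem, the objective is
to select a minimum cost set of feasible paths. Thus, we formulate the shipper
problem for piece-wise linear concave edge costs as the following mixed integer
linear program, which we denote by Problem P.

Problem P:

$$\text{Min} \quad \sum_{k=1}^{K} \sum_{p \in P_k} y_{pk} c_{pk} + \sum_{e \in SE} \sum_{r=1}^{R} \Bigl[ f_e^r x_e^r + \alpha_e^r \Bigl( \sum_{k=1}^{K} z_{ek}^r \Bigr) \Bigr]$$

s.t.

$$\sum_{p \in P_k} y_{pk} = 1, \quad \forall k = 1, 2, \dots, K, \qquad (11.1)$$

$$\sum_{p \in P_k} \delta_e^p y_{pk} w_k = \sum_{r=1}^{R} z_{ek}^r, \quad \forall e \in SE,\ k = 1, \dots, K, \qquad (11.2)$$

$$z_{ek}^r \le w_k x_e^r, \quad \forall e, r, k, \qquad (11.3)$$

$$\sum_{k=1}^{K} z_{ek}^r \le M_e^r x_e^r, \quad \forall e \in SE,\ r = 1, \dots, R, \qquad (11.4)$$

$$\sum_{k=1}^{K} z_{ek}^r \ge M_e^{r-1} x_e^r, \quad \forall e \in SE,\ r = 1, \dots, R, \qquad (11.5)$$

$$\sum_{r=1}^{R} x_e^r \le 1, \quad \forall e \in SE, \qquad (11.6)$$

$$y_{pk} \in \{0, 1\}, \quad \forall k = 1, 2, \dots, K, \text{ and } p \in P_k, \qquad (11.7)$$

$$x_e^r \in \{0, 1\}, \quad \forall e \in SE, \text{ and } r = 1, 2, \dots, R, \qquad (11.8)$$

$$z_{ek}^r \ge 0, \quad \forall e \in SE,\ \forall k = 1, 2, \dots, K, \text{ and } r = 1, 2, \dots, R.$$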

In this formulation, constraints (11.1) ensure that exactly one path is selected
for each commodity and constraints (11.2) set the total flow on an edge $e$ to be
equal to the total flow of all the paths that use that edge. Constraints (11.3)-(11.6)
are used to model the piece-wise linear concave function. Constraints (11.3) specify
that if some commodity $k$ is shipped on edge $e$ using cost index $r$, the associated
interval variable, $x_e^r$, must be 1. Constraints (11.4) and (11.5) make sure that if
cost index $r$ is used on edge $e$, then the total flow on that edge must fall in its
associated interval, $(M_e^{r-1}, M_e^r]$. Finally, constraints (11.6) indicate that at most
one cost range can be selected for each edge.

Let $Z^*$ be the optimal solution to Problem P. Let $Z^R_x$ and $Z^R_y$ be the optimal
solutions to relaxations of Problem P where the integrality constraints of interval
($x$) and path flow ($y$) variables, respectively, are dropped. A consequence of
Property 11.2.1 is the following result.

Property 11.2.2 We have,

$$Z^* = Z^R_x = Z^R_y.$$

To find a robust and efficient heuristic algorithm for Problem P, we study the
performance of a relaxation of Problem P that drops integrality and redundant
constraints. Although constraints (11.3) are not required for a correct mixed-integer
programming formulation of the problem, we keep them because they
significantly improve the performance of the linear programming relaxation of
Problem P. In fact, Croxton, Gendron and Magnanti (2000) show that, without
them, the linear programming relaxation of this model approximates the piece-wise
linear cost functions by their lower convex envelope. Furthermore, keeping these 
constraints makes constraints (11.4)-(11.6) redundant in the correct mixed-integer 
programming formulation, as a direct consequence of Property 11.2.1 part 3, and 
in the linear programming relaxation of problem P as well, as Lemma 11.2.3 below 
shows. This will be useful to considerably reduce the size of the formulation of the 
problem, while preserving the tightness of its linear programming relaxation. 

Let Problem $P^R_{LP}$ be the linear program obtained from Problem P by relaxing
the integrality constraints and constraints (11.4)-(11.6). That is,

Problem $P^R_{LP}$:

$$\text{Min} \quad \sum_{k=1}^{K} \sum_{p \in P_k} y_{pk} c_{pk} + \sum_{e \in SE} \sum_{r=1}^{R} \Bigl[ f_e^r x_e^r + \alpha_e^r \sum_{k=1}^{K} z_{ek}^r \Bigr]$$

s.t.

(11.1) - (11.3)

$$y_{pk} \ge 0, \quad \forall k = 1, 2, \dots, K, \text{ and } p \in P_k,$$

$$x_e^r \ge 0, \quad \forall e \in SE, \text{ and } r = 1, 2, \dots, R,$$

$$z_{ek}^r \ge 0, \quad \forall e \in SE,\ \forall k = 1, 2, \dots, K, \text{ and } r = 1, 2, \dots, R.$$

Chan, Muriel and Simchi-Levi (1999) prove the following. 

Lemma 11.2.3 The optimal solution value to Problem $P^R_{LP}$ is equal to the optimal
solution value to the linear programming relaxation of Problem P.

11.2.3 Structural Properties 

To analyze the relaxed problem, we start by fixing the fractional path flows and
study the behavior of the resulting linear program. Let $y = (y_{pk})$ be the vector of
path flows in a feasible solution to the relaxed linear program, Problem $P^R_{LP}$.

Observe that, given the vector of path flows $y$, the amount of each commodity
sent on each edge is known and, thus, Problem $P^R_{LP}$ can be decomposed into
multiple subproblems, one for every edge. Each subproblem determines the cost
that the linear program associates with the corresponding edge flow. We refer to
the subproblem associated with edge $e$ as the Fixed-Flow Subproblem on edge $e$,
or Problem $FF^e_y$.

Let the proportion of commodity $k$ shipped along edge $e$ be

$$\gamma_{ek} = \sum_{p \in P_k} \delta_e^p y_{pk}.$$

Using equation (11.2), the equality $\sum_{r=1}^{R} z_{ek}^r = w_k \gamma_{ek}$ must clearly hold; that is,
the sum of all the flows of commodity $k$ on the different cost intervals on edge $e$
must be equal to the total quantity, $w_k \gamma_{ek}$, of commodity $k$ that is shipped on
that edge.

For each edge $e$, the total shipping cost on $e$, as well as the value of the
corresponding variables $z_{ek}^r$ and $x_e^r$, that Problem $P^R_{LP}$ associates with the vector of
path flows $y$, can be obtained by solving the Fixed-Flow Subproblem on edge $e$:

Problem $FF^e_y$:

$$\text{Min} \quad \sum_{r=1}^{R} \Bigl[ f_e^r x_e^r + \alpha_e^r \sum_{k=1}^{K} z_{ek}^r \Bigr]$$

s.t.

$$z_{ek}^r \le w_k x_e^r, \quad \forall k = 1, \dots, K, \text{ and } r = 1, \dots, R, \qquad (11.9)$$

$$\sum_{r=1}^{R} z_{ek}^r = w_k \gamma_{ek}, \quad \forall k = 1, \dots, K, \qquad (11.10)$$

$$z_{ek}^r \ge 0, \quad \forall k = 1, \dots, K, \text{ and } r = 1, \dots, R,$$

$$x_e^r \ge 0, \quad \forall r = 1, \dots, R.$$

Let $C_e^*(y) = C_e^*(\gamma_{e1}, \dots, \gamma_{eK})$ be the optimal solution to the Fixed-Flow
Subproblem on edge $e$ for a given vector of path flows $y$, or, equivalently, for given
corresponding proportions $\gamma_{e1}, \dots, \gamma_{eK}$ of the commodities shipped on that edge.
The following Theorem determines the solution to the subproblem.

Theorem 11.2.4 For any given edge $e \in SE$, let the proportion $\gamma_{ek}$ of commodity
$k$ to be shipped on edge $e$ be known and fixed, for $k = 1, 2, \dots, K$, and let the
commodities be indexed in non-decreasing order of their corresponding proportions,
that is,

$$\gamma_{e1} \le \gamma_{e2} \le \dots \le \gamma_{eK}.$$

Then, the optimal solution to the Fixed-Flow Subproblem on edge $e$ is

$$C_e^*(\gamma_{e1}, \dots, \gamma_{eK}) = \sum_{k=1}^{K} F_e\Bigl(\sum_{i=k}^{K} w_i\Bigr) [\gamma_{ek} - \gamma_{e,k-1}], \qquad (11.11)$$

where $\gamma_{e0} := 0$.

Intuitively, the above Theorem just says that in an optimal solution to the Fixed-Flow
Subproblem associated with any edge $e$, fractions of commodities are consolidated
to be shipped at the cheapest possible cost per unit. At first, a fraction $\gamma_{e1}$ of
all commodities $1, 2, \dots, K$ is available. Thus, these commodities get consolidated
to achieve a cost per unit of $F_e(\sum_{k=1}^{K} w_k) / \sum_{k=1}^{K} w_k$, i.e., the cost per unit
associated with sending the full $K$ commodities on that edge, and the available fraction
$\gamma_{e1}$ is sent incurring a cost of $\gamma_{e1} F_e(\sum_{k=1}^{K} w_k)$. At that point, none of commodity
1 is left and a fraction $(\gamma_{e2} - \gamma_{e1})$ is the maximum available simultaneously from
all commodities $2, 3, \dots, K$. Again these commodities get consolidated and that
fraction, $(\gamma_{e2} - \gamma_{e1})$, from each commodity is sent at a cost $(\gamma_{e2} - \gamma_{e1}) F_e(\sum_{k=2}^{K} w_k)$.
This process continues until the desired proportion of each commodity has been
sent.
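
Formula (11.11) translates directly into a short routine. The following is a sketch for illustration only: `fixed_flow_cost` and the sample edge cost are hypothetical names and data, and the commodities are sorted inside the function rather than assumed to be pre-indexed.

```python
def fixed_flow_cost(gammas, weights, edge_cost):
    """Evaluate C*_e of (11.11) for proportions gamma_ek and sizes w_k on one edge."""
    # Process commodities in non-decreasing order of their proportions.
    order = sorted(range(len(gammas)), key=lambda k: gammas[k])
    remaining_weight = sum(weights)  # total size of commodities still being consolidated
    total, previous = 0.0, 0.0
    for k in order:
        total += edge_cost(remaining_weight) * (gammas[k] - previous)
        previous = gammas[k]
        remaining_weight -= weights[k]  # commodity k is now fully sent
    return total


def edge_cost(z):
    """Hypothetical concave piece-wise linear cost with F(0) = 0."""
    return 0 if z == 0 else min(10 + 5 * z, 30 + 3 * z, 60 + z)


weights = [4, 6, 10]
cost_fractional = fixed_flow_cost([0.5, 1.0, 0.25], weights, edge_cost)
cost_integral = fixed_flow_cost([1.0, 1.0, 1.0], weights, edge_cost)
```

Here a quarter of all three commodities travels together at cost F(20)/4 = 20, another quarter of commodities 1 and 2 at F(10)/4 = 15, and the remaining half of commodity 2 alone at F(6)/2 = 20. When every proportion equals 1 the formula collapses to F(20), the cost of sending everything together.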



11.2.4 Solution Procedure 

Theorem 11.2.4 provides a simple expression of the cost that the relaxed problem,
Problem $P^R_{LP}$, assigns to any given fractional path flows and thus it allows for the
efficient computation of the impact of modifying the flow in a particular path.
This is the key to the algorithm developed in this section. Indeed, the algorithm
transforms an optimal fractional solution to the linear program $P^R_{LP}$ into an integer
solution by modifying path flows, choosing for each commodity the path that leads
to the lowest increase in the objective of the linear program.

The Linear Programming Based Heuristic:

Step 1: Solve the linear program, Problem $P^R_{LP}$. Initialize $k = 1$.

Step 2: For each arc compute a marginal cost which is the increase in cost
incurred in the Fixed-Flow Subproblem by augmenting the fractional flow of
commodity $k$ to 1. Note that this is easy to compute using Theorem 11.2.4.

Step 3: Determine a path for commodity $k$ by finding the minimum cost path on
the expanded network with edge costs equal to the marginal costs.

Step 4: Update the flows and the costs on each link (again employing Theorem
11.2.4) to account for commodity $k$ being sent along that path.

Step 5: Let $k = k + 1$ and repeat steps (2)-(5) until $k = K + 1$.
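
The marginal cost in Step 2 can be sketched on a single edge as the difference between two evaluations of (11.11). This is an illustrative fragment only: it does not solve the linear program or search for the minimum cost path of Step 3, and the names `fixed_flow_cost`, `marginal_cost` and the fixed-charge edge are assumptions.

```python
def fixed_flow_cost(gammas, weights, edge_cost):
    """C*_e of (11.11): consolidate commodities in order of increasing proportion."""
    order = sorted(range(len(gammas)), key=lambda k: gammas[k])
    remaining_weight, total, previous = sum(weights), 0.0, 0.0
    for k in order:
        total += edge_cost(remaining_weight) * (gammas[k] - previous)
        previous = gammas[k]
        remaining_weight -= weights[k]
    return total


def marginal_cost(k, gammas, weights, edge_cost):
    """Step 2: cost increase on this edge when commodity k's proportion is raised to 1."""
    raised = list(gammas)
    raised[k] = 1.0
    return fixed_flow_cost(raised, weights, edge_cost) - fixed_flow_cost(
        gammas, weights, edge_cost
    )


def fixed_charge(z):
    """Hypothetical edge cost: a pure fixed charge of 100 for any positive shipment."""
    return 100.0 if z > 0 else 0.0


weights = [100, 100]
on_shared_edge = marginal_cost(0, [0.5, 0.5], weights, fixed_charge)
on_empty_edge = marginal_cost(0, [0.0, 0.0], weights, fixed_charge)
```

On the edge that already carries half of both commodities, routing commodity 1 entirely over it adds 50; on an unused edge it adds the full fixed charge of 100. In Step 3 these values would serve as edge lengths for a shortest path computation.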

Evidently, the effectiveness of this heuristic depends on the tightness of the linear
programming relaxation of Problem P. For this reason, we study the difference
between integer and fractional solutions to Problem P. Chan, Muriel and Simchi-Levi
show that in some special cases an integer solution can be constructed from
the optimal fractional solution of Problem $P^R_{LP}$ without increasing its cost. In
particular, using Theorem 11.2.4, they prove the following result.

Theorem 11.2.5 In the following cases:

1. Single period, multiple suppliers, multiple retailers, two warehouses,

2. Two periods, single supplier, multiple retailers, single warehouse,

3. Two periods, multiple suppliers, multiple retailers, single warehouse using a
cross-docking strategy,

4. Multiple periods, single supplier, single retailer, single warehouse that uses a
cross-docking strategy,

the solution to the linear programming relaxation of problem P is the optimal
solution to the shipper problem. That is,

$$Z^* = Z^{LP}.$$

Furthermore, in the first three cases, all extreme point solutions to the linear
program are integer.

The cross-docking strategy referred to in the last two cases, is a strategy 
in which the stores are supplied by central warehouses which do not keep any 
stock themselves. That is, in this strategy, the warehouses act as coordinators of 
the supply process, and as transshipment points for incoming orders from outside 
vendors. 

The Theorem thus demonstrates the exceptional performance of the linear
programming relaxation, and consequently of the heuristic, in some special cases. A
natural question at this point is whether these results can be generalized. The
answer is no in general. To show this, Chan, Muriel and Simchi-Levi construct
examples with a single supplier, a single warehouse and multiple retailers and time
periods, for which

$$Z^* / Z^{LP} \to \infty$$

as the number of retailers and time periods increases.

Lemma 11.2.6 The linear programming relaxation of Problem P can be arbitrarily
weak, even for a single-supplier, single-warehouse, multi-retailer case in which
demand for the retailers is constant over time.

It is important to point out that the instances in which the heuristic solution is
found to be arbitrarily bad are characterized by the unrealistic structure of the
shipping cost. In these instances, the shipping cost between two facilities is a pure
fixed charge (regardless of quantity shipped) in some periods, linear (with no fixed
charges) in others, and yet prohibitively expensive so that nothing can be shipped
in the remaining periods. The following examples illustrate this structure.

Example of weak linear programming solution: Consider a three-period 
single-warehouse model in which a single supplier delivers goods to a warehouse 
which, in turn, replenishes inventory of three retailers over time. The warehouse 
uses a cross-docking strategy and, thus, it does not keep any inventory. Let trans- 
portation cost be a fixed charge of 100 for any shipment from the supplier to 
the warehouse at any period. The transportation cost from the warehouse to retailer i, 
i = 1, 2, 3, is very large for shipments made in period i (in other words, retailer i 
cannot be reached in period i) and negligible for periods j ≠ i. Let inventory cost 
be negligible for all retailers at all periods, and let demand for each retailer be 0 
units in periods 1 and 2 and 100 units in period 3. 

Observe that, in order to reach the three retailers, shipments need to be made in 
at least two different periods. Thus, the optimal integer solution is 200. However, 
in the solution to the linear program 50 units are sent to each of the “reachable” 
retailers in each period, and a transportation cost of 50 is charged at each period 
(as stated in Theorem 11.2.4, since only a fraction of 1/2 of the commodities is 
sent on any edge, exactly that fraction of the fixed cost is charged). Thus, the 
optimal fractional solution is 150 and the ratio of integer to fractional solutions is 
200/150 = 4/3. 
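The arithmetic of this example can be checked with a short sketch. It is not the authors' algorithm: it simply enumerates the sets of shipping periods for the integer problem and evaluates the cost of a given fractional opening of the periods. The names `integer_optimum` and `fractional_cost` are hypothetical, and the fractional check assumes (as in the text) that a retailer can receive in a period at most the fraction to which that period's supplier-to-warehouse shipment is opened.

```python
from itertools import combinations

PERIODS = (1, 2, 3)
RETAILERS = (1, 2, 3)
FIXED_CHARGE = 100  # supplier-to-warehouse fixed charge for each period used


def reachable(retailer, period):
    # Retailer i cannot be reached in period i.
    return retailer != period


def integer_optimum():
    """Cheapest set of shipping periods from which every retailer can be reached."""
    best = None
    for k in range(1, len(PERIODS) + 1):
        for used in combinations(PERIODS, k):
            if all(any(reachable(r, t) for t in used) for r in RETAILERS):
                cost = FIXED_CHARGE * k
                best = cost if best is None else min(best, cost)
    return best


def fractional_cost(x):
    """Cost when period t is opened to the fraction x[t].

    Raises ValueError if some retailer cannot receive its full demand.
    """
    for r in RETAILERS:
        if sum(x[t] for t in PERIODS if reachable(r, t)) < 1 - 1e-9:
            raise ValueError(f"retailer {r} is not fully served")
    return FIXED_CHARGE * sum(x.values())


z_int = integer_optimum()
z_frac = fractional_cost({1: 0.5, 2: 0.5, 3: 0.5})
print(z_int, z_frac, z_int / z_frac)
```

Opening every period to one half is feasible because each retailer has two reachable periods. Summing the three coverage requirements shows that no fractional solution can open less than 1.5 periods in total, so 150 is indeed the fractional optimum.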

In this instance, even though the fractional and integer solutions are different, the 
linear programming based heuristic generates the optimal integer solution. However, 
we can easily extend the above scenario to instances for which the difference be- 
tween the solution generated by the heuristic and the optimal integer solution is 
arbitrarily large. 

Example of weak heuristic solution: For that purpose, we add n new periods 
to the above setting. In period 4, the first of the new periods, the cost for shipping 
from supplier to warehouse is linear at a rate of 1/3 and the cost for shipping from 
the warehouse to each of the 3 retailers is 0. In all the other n − 1 periods the 
cost of shipping is very high and thus no shipments will be made after period 4. 
Inventory costs at all retailers and all periods are negligible. Demand for each of 
the three retailers at each of the new n periods is 100, while demand during the 
first 3 periods is 0. It is easy to see that the optimal integer and fractional solutions 
are identical to those in the 3-period case, with costs of 200 and 150 respectively. 
However, the heuristic algorithm will always choose to ship each commodity in 
period 4, since the increase in cost in the corresponding path would be 1/3 × 100 
while it is at least 50 in any of the first 3 periods. Thus, the total cost of the 
heuristic solution is 1/3 × 100 × n and the gap with the optimal integer solution 
is arbitrarily large. 

The following section reports the performance of the algorithm on a set of 
randomly generated instances. 

11.2.5 Computational Results 

The computational tests carried out are divided into three categories: 

1. Single-period layered networks, 

2. General networks, 

3. Multi-period single-warehouse distribution problems: 

• Pure distribution instances. 

• Production/distribution instances. 

The first two categories are of special interest because they allow us to compare 
our results with those reported by Balakrishnan and Graves (1989), henceforth B&G. 
The third set of problems models practical situations in which each of the retailers 
is assigned to a single warehouse and production and transportation costs have to 
be balanced with inventory costs over time. 

In the three categories the tests were run on a Sun SPARC20 and CPLEX was 
used to solve the linear program, Problem P kp , using an equivalent formulation 
where path flow variables are replaced by flow-balance constraints. During our 
computational work, we observed that the dual simplex method is more efficient 
than the primal simplex method in solving these highly degenerate problems, an 
observation also made by Melkote (1996). This is usually the case for programs 
with variable upper bound constraints, such as our constraints $z^r_{ek} \le w_k x^r_e$. We 
should also point out that most of the CPU time reported in our tests is used in 
solving the linear program. Thus, to enhance the computational performance of 
our algorithm and increase the size of the problems that it is capable of handling, 
future research focused on efficiently solving the linear program is needed. For 
instance, the original set-partitioning formulation, Problem P pp , could be solved 
faster using column generation techniques. In these tests, however, we focused on 
evaluating the quality of the integer solutions provided by the heuristic and the 
tightness of the linear programming relaxation. 

We now discuss each class of problems and the effectiveness of our algorithm. 

Single-period Layered Networks 

B&G present exceptional computational results for single-period layered networks. 
In these instances, commodities flow from the manufacturing facilities to distribu- 
tion centers, where they are consolidated with other shipments. These shipments 
are then sent to a number of warehouses, where they are split and shipped to their 
final destinations. Thus, every commodity must go through two layers of interme- 
diate points: consolidation points, also referred to as distribution centers, and 
breakbulk points, or warehouses. 

To test the performance of our algorithm and to compare it with that of B&G, 
we generated instances of the layered networks following the details given in their 
paper. In this computational work, five different problem classes, referred to as 
LTL1 - LTL5, are considered. 

                         Problem Class
                   LTL1     LTL2      LTL3      LTL4      LTL5
Number of Nodes
  SOURCE           4        5         6         8         10
  CONSOLIDN        5        10        12        15        20
  BREAKBULK        5        10        12        15        20
  DESTN            4        5         6         8         10
Arcs               42-47    131-141   190-207   309-312   358-372
Commodities        10       20        30        50        60

TABLE 11.1. Test problems generated as in Balakrishnan and Graves 

Problem    B&G            LPBH
Class      LB/UB          LP/Heuristic    Avge. CPU Time
           Percentage     Percentage      in seconds
LTL1       99.8           100             1.04
LTL2       100            100             7.94
LTL3       99.6           100             20.74
LTL4       99.1           100             55.72
LTL5       99.5           100             100.48

TABLE 11.2. Computational results for layered networks, Balakrishnan and Graves results 
(B&G) versus those of our Linear Programming Based Heuristic (LPBH). 

Table 11.1 shows the sizes of the different classes of problems. For each of these 
classes, the first column (B&G) of Table 11.2 presents the average ratio between the 
upper bounds generated by the heuristic proposed by B&G and a lower bound on 
the optimal solution, over 5 randomly generated instances. The numbers are taken 
from their paper. We do not include their average CPU times, however, because 
the machines they used are completely different from ours and, in addition, they 
do not report total computational time for the entire algorithm. The second and 
third columns report the average deviation from optimality and computational 
performance of the Linear Programming Based Heuristic (LPBH) over 10 random 
instances, for each of the problem classes. In all of them, our algorithm finds the 
optimal integer solution; furthermore, the solution to the linear program in 
the first step of our algorithm is integer, providing the optimal solution to the 
problem. 

Of course, since in all the previous instances the linear program provided the 
optimal integer solution, the performance of our procedure has not really been 
tested. In the following subsections we present computational results for problem 
classes in which the solution to the linear program is not always integer. 

General Networks 

In this subsection, we report on the performance of our algorithm on general 
networks, in which every node can be an origin and/or a destination, generated 
exactly as they are generated by B&G. These results together with those of B&G 
are reported in Table 11.3. In this category, B&G consider five different problem 
classes, referred to as GEN1, ..., GEN5, and generate five random instances for 
each of them. We, in turn, solve ten different randomly generated instances for 
each of the problem classes. Again, we do not include their average CPU times 
due to the reasons mentioned above. 



           Size                            B&G           LPBH
Problem    No. of   No. of     No. of      LB/UB         LP/Heuristic    Avge. CPU Time
Class      Nodes    Arcs       Comm.       Percentage    Percentage      in seconds
GEN1       10       47-54      10          99.9          100             2.18
GEN2       15       109-136    20          98.7          99.53           24.04
GEN3       20       196-235    30          98.4          99.88           139.83
GEN4       30       364-428    50          96.2          98.59           1313.06
GEN5       40       340-370    60          98.5          99.98           159.57

TABLE 11.3. Computational results for general networks. Balakrishnan and Graves 
results (B&G) versus those of our Linear Programming Based Heuristic (LPBH). 



Multi-Period Single-Warehouse Distribution Problems 

Here we consider a single-warehouse model where a set of suppliers replenishes in- 
ventory of a number of retailers over time. We test two different types of instances: 
pure distribution instances in which the routing and timing of shipments are to be 
determined, and production/distribution instances in which the production sched- 
ule is also integrated with the transportation and inventory decisions. 

A. Pure Distribution Instances 

We assume that shortages are not allowed and analyze three different strategies: 

1. Classical Inventory/Distribution Strategy: Material always flows from the 
suppliers through a single warehouse, where it can be held as inventory. 

2. Crossdocking Strategy: All material flows through the warehouse where ship- 
ments are reallocated and immediately sent to the retailers. 

3. A Distribution Strategy that Allows for Direct Shipments: Items may be sent 
either through the warehouse or directly to the retailer. The warehouse may 
keep inventory. 

Type of arc      Linear rate 1   Linear rate 2   Linear rate 3   Set-up
Supplier-Whs.    0.15            0.105           0.084           25
Whs.-Retailer    0.25            0.20            0.16            10

TABLE 11.4. Linear and set-up costs used for all the test problems. 

For each strategy, we analyze different situations where the number of suppliers 
is either 1, 2 or 5, the number of retailers is 10, 12 or 20 and the number of periods 
is 8 or 12. For each combination of the number of suppliers, retailers and periods 
presented in Table 11.6, 10 instances are generated. The retailers and suppliers 
are randomly located on a 1000 × 1000 grid, while the warehouse is randomly 
assigned to the 400 × 400 subgrid at the center. Demand is generated for each 
retailer-supplier pair at each time period, except for the cases with 5 suppliers in 
which each of these pairs has an associated demand with probability 1/3. These 
demands are generated from a uniform distribution on the integers in the interval 
[0,100). 
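The generation procedure just described can be sketched as follows. This is only an illustration under stated assumptions: the function name `generate_instance`, the use of integer grid coordinates, the reading of the central subgrid as [300, 700] × [300, 700], and the seed handling are all hypothetical choices, not details taken from the original experiments.

```python
import math
import random


def generate_instance(num_suppliers, num_retailers, num_periods, seed=0):
    """Random single-warehouse instance in the spirit of the description above."""
    rng = random.Random(seed)

    def grid_point(low, high):
        # Assumption: facilities sit on integer nodes of the grid.
        return (rng.randint(low, high), rng.randint(low, high))

    suppliers = [grid_point(0, 1000) for _ in range(num_suppliers)]
    retailers = [grid_point(0, 1000) for _ in range(num_retailers)]
    # Assumption: the central 400 x 400 subgrid is [300, 700] x [300, 700].
    warehouse = grid_point(300, 700)

    # With 5 suppliers, a retailer-supplier pair has demand with probability 1/3.
    pair_probability = 1 / 3 if num_suppliers == 5 else 1.0
    demand = {}
    for r in range(num_retailers):
        for s in range(num_suppliers):
            if rng.random() < pair_probability:
                # Uniform on the integers 0, 1, ..., 99 for every period.
                demand[(r, s)] = [rng.randrange(0, 100) for _ in range(num_periods)]
    return suppliers, retailers, warehouse, demand


def euclidean(a, b):
    """Euclidean distance between two grid nodes."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


suppliers, retailers, warehouse, demand = generate_instance(2, 10, 12)
print(len(demand), "retailer-supplier pairs with demand")
```

Fixing the seed makes the ten instances of a class reproducible, for example by calling the function with seeds 0 through 9.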

All suppliers and retailers are linked to the warehouse and the distance associ- 
ated is the corresponding Euclidean distance between the nodes of the grid. In the 
case of a Distribution Strategy that Allows for Direct Shipments, shipping edges 
from each of the suppliers to each of the retailers are added. The holding costs 
per unit of inventory are different at the warehouses and retailer facilities and are 
presented in Table 11.5. All holding costs at the suppliers are set to zero. Two 
shipping-cost functions, representing cost per item per unit distance, are consid- 
ered: The first is assigned to shipments from the suppliers to the warehouse. The 
second is incurred by the material flowing from the warehouse to the retailers. The 
cost function (dollars per mile per unit) associated with direct shipments is equal 
to that of shipments from the warehouse to a retailer. Both functions have an ini- 
tial set-up cost for using the link and three different linear rates depending on the 
quantity shipped, see Table 11.4. However, the ranges to which those linear costs 
correspond are different for the different Problem classes. This is done so that, in 
an optimal solution, shipments are consolidated and thus the concave cost func- 
tion plays an important role in the analysis. These ranges and the corresponding 
problem classes are presented in Table 11.5. 
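A cost function of this shape can be sketched as below. The text does not spell out how the three rates are applied, so this sketch assumes an incremental interpretation: the first rate applies to units up to the first range limit, the second rate to units between the two limits, and the third rate to units beyond the second limit. It also assumes that the set-up cost is charged once per shipment and is not scaled by distance. The function name `shipping_cost` and its parameters are hypothetical.

```python
def shipping_cost(quantity, distance, rates, range_limits, setup):
    """Piecewise-linear concave shipping cost with a set-up charge.

    rates        -- (rate1, rate2, rate3), cost per item per unit distance
    range_limits -- (limit1, limit2), quantities at which the rate changes
    setup        -- fixed charge for using the link (assumed not distance-scaled)
    """
    if quantity <= 0:
        return 0.0
    limit1, limit2 = range_limits
    rate1, rate2, rate3 = rates
    first = min(quantity, limit1)                        # units charged at rate1
    second = min(max(quantity - limit1, 0), limit2 - limit1)  # units at rate2
    third = max(quantity - limit2, 0)                    # units at rate3
    return setup + (rate1 * first + rate2 * second + rate3 * third) * distance


# Supplier-to-warehouse rates and set-up from Table 11.4, with the range
# limits 800 and 1500 listed for class I2 in Table 11.5.
rates, limits, setup = (0.15, 0.105, 0.084), (800, 1500), 25
for q in (0, 800, 1000, 2000):
    print(q, round(shipping_cost(q, 1.0, rates, limits, setup), 3))
```

Because the rates decrease from one range to the next, the marginal cost of an extra unit never increases, which is what makes consolidating shipments attractive.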

Observe from Table 11.6 that in most of the instances tested the linear program 
is tight and provides the optimal integer solution. In only three of the 
150 instances generated is the solution to the linear program not integer and, in 
those cases, our algorithm finds a solution that is within 0.8% of the optimal 
fractional solution. 

B. Production/Distribution Instances 

This section demonstrates the effectiveness of the algorithm when applied to pro- 
duction/distribution systems, i.e., systems in which one needs to coordinate pro- 
duction planning, inventory control and transportation strategies over time. For 
that purpose, we consider the same set of problems, I1 - I3, as in the Classical 
Inventory/Distribution Strategy described in the previous section and add 
production decisions at each of the supplier sites. This is incorporated into the 
model as explained in Section 11.2. 

Problem   Inventory Cost          Supplier-Whs. Cost      Whs.-Retailer Cost
Class     Warehouse   Retailer    Range 1     Range 2     Range 1     Range 2
I1                                                        200         400
I2        5           10          800         1500        300         600
I3                                                        300         600

I4                                                        150         300
I5        10          20          1000        2000        200         400
I6                                                        200         400

C1                                                        200         400
C2        10          20          800         1500        300         600
C3                                                        300         600

C4                                                        150         300
C5        10          20          1000        2000        200         400
C6                                                        200         400

D1                                                        150         300
D2        10          20          500         1000        200         400
D3                                                        200         400

TABLE 11.5. Inventory costs and different ranges for the different test problems. 



We consider a fixed set-up cost for producing at any period plus a certain cost 
per unit. The set-up cost is varied in the set {50, 100, 500, 1000} and the linear 
production cost is set to 1. Inventory holding rate at the supplier site (after pro- 
duction) is set to half of that at the warehouse. For the sixty different instances 
generated, the linear programming relaxation gave an integer solution every time. 



11.3 Safety Stock Optimization 

As observed earlier, the Shipper model is deterministic; safety 
stocks are determined exogenously and incorporated into the minimum inventory 
level that should be maintained at the beginning of each period. The objective of 
this section is to present a model for positioning and optimizing safety stock in 
the supply chain. 

For this purpose, consider a single product, single facility periodic review inven- 
tory model. Let 

SI be the amount of time it takes from placing an order until the facility receives 
a shipment; this time is referred to as Incoming Service Time. 

S be the Committed Service Time made by the facility to its own customers. 
T be the Processing Time at the facility. 

Of course, we must assume that SI + T > S, otherwise no inventory is needed in 
the facility. 

                           Problem   Number of   Number of   Number of   LP/Heuristic   CPU Time
Strategy                   Class     Suppliers   Stores      Periods     Percentage     in seconds
Classical Inventory/       I1        1                                   100            65.21
Distribution Strategy      I2        2           10          12          100            187.37
                           I3        5                                   100            163.23

                           I4        1                                   99.946         83.5
                           I5        2           20          8           100            210.51
                           I6        5                                   99.953         200.68

Crossdocking Strategy      C1        1                                   100            60.0
                           C2        2           10          12          100            174.13
                           C3        5                                   100            159.06

                           C4        1                                   100            79.73
                           C5        2           20          8           100            202.83
                           C6        5                                   100            186.0

Direct Shipments           D1        1                                   100            51.23
Allowed                    D2        2           12          8           100            165.83
                           D3        5                                   99.921         117.27

TABLE 11.6. Computational results for a single warehouse. 

We assume that the facility manages its inventory following a periodic review 
policy, and that demand is independent and identically distributed across time 
periods following a normal distribution. Given deterministic SI, S, and T, and 
with no set-up costs, the level of safety stock that the facility needs to keep, see 
Exercise 11.1, is equal to 

\[ z h \sqrt{SI + T - S}, \]

where z is the safety stock factor associated with a specified level of service and 
h is the inventory holding cost. The value SI + T − S is referred to as the facility 
Net Lead Time. 
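As a small numerical illustration, the expression above can be evaluated directly. The helper names below are hypothetical. The sketch also assumes, in line with Exercise 11.1, that the safety stock factor z is the standard normal quantile of the desired service level; the text itself only says that z is associated with a specified level of service.

```python
from math import sqrt
from statistics import NormalDist


def net_lead_time(si, t, s):
    """Net Lead Time SI + T - S; it must be nonnegative."""
    nlt = si + t - s
    if nlt < 0:
        raise ValueError("committed service time exceeds SI + T")
    return nlt


def safety_factor(service_level):
    # Assumption: z is the standard normal quantile of the service level.
    return NormalDist().inv_cdf(service_level)


def safety_stock_term(z, h, si, t, s):
    """Evaluates z * h * sqrt(SI + T - S) as written in the text."""
    return z * h * sqrt(net_lead_time(si, t, s))


z = safety_factor(0.975)  # roughly 1.96
print(round(safety_stock_term(z, h=1.5, si=3, t=2, s=1), 3))
```

Note how the expression grows with the square root of the Net Lead Time: quoting a longer committed service time S to customers shortens the Net Lead Time and lowers the value.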

To understand our model, consider the following two-stage supply chain with 
facility 2 feeding facility 1, which serves the end customer. Define $SI_i$, $S_i$, and $T_i$ as 
before for i = 1, 2. Thus, $S_1$ is the committed service time to the end customer, $S_2$ 
is the commitment that facility 2 makes to facility 1, and hence $S_2 = SI_1$. Finally, 
$SI_2$ is the supplier commitment to facility 2. 

Our objective is to minimize total supply chain cost without affecting, or pushing, 
inventory to the supplier. Observe that if we reduce $S_2 = SI_1$ we will affect 
inventory at both facility 1 and facility 2. Indeed, by reducing the committed 
service time that facility 2 makes to facility 1, inventory at facility 1 is reduced but 
inventory at the second facility is increased. Thus, our objective is to develop a 
model that selects the appropriate level of commitment that one facility makes to 
its downstream facility so as to minimize total, or more precisely, system-wide 
safety stock cost. 

For this purpose, consider a supply network G(N, A) which is acyclic, with N 
facilities and edge set A. Let D ⊂ N be the set of customers, or demand 
points, with $s_j$ being an upper bound on the commitment to be made to customer 
j, j ∈ D. 

FIGURE 11.3. Illustration of the model 

Following our discussion, we formulate the problem of setting commitments and 
safety stock levels as the following non-linear optimization problem. 

\[
\text{Problem } SS: \quad \min \sum_{j=1}^{N} z_j h_j \Phi(X_j)
\]

s.t. 

\begin{align}
X_j &= SI_j + T_j - S_j, && \forall j = 1, \ldots, N, \tag{11.12} \\
X_j &\ge 0, && \forall j = 1, \ldots, N, \tag{11.13} \\
SI_j - S_i &\ge 0, && \forall (i, j) \in A, \tag{11.14} \\
S_j &\le s_j, && \forall j \in D, \tag{11.15} \\
S_j, SI_j &\ge 0, && \forall j = 1, \ldots, N, \tag{11.16}
\end{align}

where $\Phi(X_j) = \sqrt{X_j}$. 

Observe that in this formulation, there are two sets of decision variables. The 
first is $S_j$, the commitment made by facility j to all its downstream facilities. 
The second is $SI_j$, the implied incoming service time to facility j. This incoming 
service time is the maximum of the committed service times of all the upstream 
facilities feeding facility j. 

Thus, constraint (11.12) defines the net lead time at facility j. Constraint (11.14) 
forces the incoming service time for facility j to be no smaller than the commitment 
that each facility i with (i, j) ∈ A makes to facility j. Finally, constraint (11.15) 
forces the commitment to the end customer to be no larger than the target $s_j$. 

Of course, the challenge is to solve this formulation effectively. Graves and 
Willems (2000) proposed a dynamic programming algorithm while Magnanti, 
Shen, Shu, Simchi-Levi and Teo (2003) developed an efficient algorithm based on 
an approach similar to the one described earlier for the Shipper problem. 
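To make the trade-off in Problem SS concrete, the two-stage chain discussed earlier can be solved by plain enumeration. The sketch below is not the dynamic program of Graves and Willems nor the algorithm of Magnanti et al.; it is a brute-force illustration under assumptions: service times are restricted to integers, the incoming service time of facility 1 is set equal to the commitment of facility 2, and the products of z and h are passed as the single weights `zh1` and `zh2`. All names are hypothetical.

```python
from math import sqrt


def solve_two_stage(si2, t1, t2, s1_target, zh1, zh2):
    """Enumerate integer commitments (S1, S2) for a two-stage serial chain.

    Facility 2 feeds facility 1, so SI1 = S2 (assumed tight).
    Returns (cost, S1, S2) minimizing zh1*sqrt(X1) + zh2*sqrt(X2).
    """
    best = None
    for s2 in range(0, si2 + t2 + 1):          # keeps X2 = SI2 + T2 - S2 >= 0
        for s1 in range(0, s1_target + 1):     # commitment to customer <= target
            x1 = s2 + t1 - s1                  # net lead time at facility 1
            x2 = si2 + t2 - s2                 # net lead time at facility 2
            if x1 < 0:
                continue
            cost = zh1 * sqrt(x1) + zh2 * sqrt(x2)
            if best is None or cost < best[0]:
                best = (cost, s1, s2)
    return best


# Holding inventory downstream is twice as expensive as upstream.
print(solve_two_stage(si2=1, t1=2, t2=3, s1_target=0, zh1=2.0, zh2=1.0))
# Holding inventory upstream is five times as expensive as downstream.
print(solve_two_stage(si2=1, t1=2, t2=3, s1_target=0, zh1=1.0, zh2=5.0))
```

In both runs the chosen commitment of facility 2 sits at an extreme: either zero, so that facility 2 holds all the safety stock it needs to serve facility 1 immediately, or the largest value allowed, so that facility 2 holds none. This reflects the concavity of the square root in the objective.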



11.4 Exercises 



Exercise 11.1. Consider a single product, single facility, infinite horizon, periodic 
review model. Assume that the inventory is managed based on a stationary base 
stock policy and unsatisfied demand is backlogged. At each period, demand arrives 
according to a normal distribution with mean μ and standard deviation σ. (Let’s 
assume that the probability of having negative demand is negligible.) Let SI be the 
incoming service time, S be the committed service time and T be the processing 
time at the facility. Finally, assume that the initial inventory level equals the base 
stock level and assume the service level is α, which is defined as the probability 
that the demand can be fully satisfied from current on-hand inventories. Show that 
the base stock level is 
$(SI + T - S)\mu + \Psi^{-1}(\alpha)\sqrt{SI + T - S}\,\sigma$, where $\Psi^{-1}$ is the inverse of the cumulative 
distribution function of the standard normal distribution. 

Exercise 11.2. Assume, in Problem SS of Section 11.3, that the supply network 
reduces to a serial supply chain. Show that $S_i = 0$ or $S_i = S_{i+1} + T_i$, where 
stage i + 1 serves stage i. Based on this observation, propose an algorithm to solve 
Problem SS. 




12 

Facility Location Models 



12.1 Introduction 

One of the most important aspects of logistics is deciding where to locate new 
facilities such as retailers, warehouses or factories. These strategic decisions are a 
crucial determinant of whether materials will flow efficiently through the distribu- 
tion system. 

In this chapter we consider several important warehouse location problems: the 
p-Median Problem, the Single-Source Capacitated Facility Location Problem and 
a distribution system design problem. In each case, the problem is to locate a 
set of warehouses in a distribution network. We assume that the cost of locating 
a warehouse at a particular site includes a fixed cost (e.g., building costs, rental 
costs, etc.) and a variable cost for transportation. This variable cost includes the 
cost of transporting the product to the retailers as well as possibly the cost of 
moving the product from the plants to the warehouse. In general, the objective is 
to locate a set of facilities so that total cost is minimized subject to a variety of 
constraints which might include: 

• each warehouse has a capacity which limits the area it can supply. 

• each retailer receives shipments from one and only one warehouse. 

• each retailer must be within a fixed distance of the warehouse that supplies 
it, so that a reasonable delivery lead time is ensured. 

Location analysis has played a central role in the development of the operations 
research field. In this area lie some of the discipline’s most elegant results and 
theories. We note here the paper of Cornuejols et al. (1977) and the two excellent 
books devoted to the subject by Mirchandani and Francis (1990) and Daskin 
(1995). Location problems encompass a wide range of problems such as the 
location of emergency services (fire houses or ambulances), the location of hazardous 
materials and problems in telecommunications network design, to name a few. 

In the next section, we present an exact algorithm for one of the simplest location 
problems, the p-Median Problem. We then generalize this model and algorithm to 
incorporate additional factors important to the design of the distribution network, 
such as warehouse capacities and fixed costs. In Section 12.4, we present a more 
general model where all levels of the distribution system (plants and retailers) are 
taken into account when deciding warehouse locations. We also present an efficient 
algorithm for its solution. All of the algorithms developed in this chapter are based 
on the Lagrangian relaxation technique described in Chapter 5.3 which has been 
applied successfully to a wide range of location problems. Finally, in Section 12.5, 
we describe the structure of the optimal solution to problems in the design of 
large-scale logistics systems. 



12.2 An Algorithm for the p-Median Problem 

Consider a set of retailers geographically dispersed in a region. The problem is to 
choose where in the region to locate a set of p identical warehouses. We assume 
there are m > p sites that have been preselected as possible locations for these 
warehouses. Once the p warehouses have been located, each of n retailers will get 
its shipments from the warehouse closest to it. We assume: 

• there is no fixed cost for locating at a particular site, and 

• there is no capacity constraint on the demand supplied by a warehouse. 

Note that the first assumption also encompasses the case where the fixed cost is 
not site-dependent and therefore the fixed set-up cost for locating p warehouses is 
independent of where they are located. 

Let the set of retailers be N where N = {1, 2, ..., n}, and let the set of potential 
sites for warehouses be M where M = {1, 2, ..., m}. Let $w_i$ be the demand or flow 
between retailer i and its warehouse for each i ∈ N. We assume that the cost of 
transporting the $w_i$ units of product from warehouse j to retailer i is $c_{ij}$, for each 
i ∈ N and j ∈ M. 

The problem is to choose p of the m sites where a warehouse will be located in 
such a way that the total transportation cost is minimized. This is the p-Median 
Problem. 

The continuous version of this problem, where any point is a potential warehouse 
location, was first treated as early as 1909 by Weber. The discrete version was 
analyzed by Kuehn and Hamburger (1963) as well as Hakimi (1964), Manne (1964), 
Balinski (1965) and many others. 

We present here a highly effective approach to the problem. Define the following 
decision variables: 

\[
Y_j = \begin{cases} 1, & \text{if a warehouse is located at site } j, \\ 0, & \text{otherwise,} \end{cases}
\]

for j ∈ M, and 

\[
X_{ij} = \begin{cases} 1, & \text{if retailer } i \text{ is served by a warehouse at site } j, \\ 0, & \text{otherwise,} \end{cases}
\]

for i ∈ N and j ∈ M. The p-Median Problem is then: 

\[
\text{Problem } P: \quad \min \sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij} X_{ij}
\]

s.t. 

\begin{align}
\sum_{j=1}^{m} X_{ij} &= 1, && \forall i \in N \tag{12.1} \\
\sum_{j=1}^{m} Y_j &= p \tag{12.2} \\
X_{ij} &\le Y_j, && \forall i \in N, j \in M \tag{12.3} \\
X_{ij}, Y_j &\in \{0, 1\}, && \forall i \in N, j \in M. \tag{12.4}
\end{align}

Constraints (12.1) guarantee that each retailer is assigned to a warehouse. Con- 
straint (12.2) ensures that p sites are chosen. Constraints (12.3) guarantee that a 
retailer selects a site only from among those that are chosen. Constraints (12.4) 
force the variables to be integer. 
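For very small instances, Problem P can be solved by enumerating all choices of p sites, which gives a handy reference value when experimenting with bounds. The sketch below is a brute-force illustration, not the approach developed in this section; the function name `p_median_brute_force` and the small cost matrix are made up for the example. It uses the fact that, once the open sites are fixed, constraints (12.1) and (12.3) are satisfied at least cost by assigning each retailer to its cheapest open site.

```python
from itertools import combinations


def p_median_brute_force(c, p):
    """Solve the p-Median Problem by enumeration.

    c[i][j] is the cost of serving retailer i from site j.
    Returns (optimal cost, tuple of open sites).
    """
    n, m = len(c), len(c[0])
    best_cost, best_sites = float("inf"), None
    for sites in combinations(range(m), p):       # constraint (12.2)
        # Each retailer is served by its cheapest open site.
        cost = sum(min(c[i][j] for j in sites) for i in range(n))
        if cost < best_cost:
            best_cost, best_sites = cost, sites
    return best_cost, best_sites


# Hypothetical data: 4 retailers, 3 candidate sites, p = 2.
costs = [
    [0, 4, 9],
    [4, 0, 3],
    [9, 3, 0],
    [2, 5, 1],
]
print(p_median_brute_force(costs, 2))
```

The number of subsets grows combinatorially in m and p, so this is only practical for toy instances.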

This formulation can easily handle several side constraints. If a handling fee is 
charged for each unit of product going through a warehouse, these costs can be 
added to the transportation cost along all arcs leaving the warehouse. Also, if a 
particular limit is placed on the length of any arc between retailer i and warehouse 
j, this can be incorporated by simply setting the per unit shipping cost ($c_{ij}$) to 
+∞. In addition, the model can be easily extended to cases where a set of facilities 
are already in place and the choice is whether to open new facilities or expand the 
existing facilities. 

Let Z* be the optimal solution value of Problem P. One simple and effective technique 
to solve this problem is the method of Lagrangian relaxation described in Chapter 
5.3. 

As described in Chapter 5.3, Lagrangian relaxation involves relaxing a set of 
constraints and introducing them into the objective function with a multiplier 
vector. This provides a lower bound on the optimal solution to the overall problem. 
Then, using a subgradient search method, we iteratively update our multiplier 
vector in an attempt to increase the lower bound. At each step of the subgradient 
procedure (i.e., for each set of multipliers) we also attempt to construct a feasible 
solution to the location problem. This step usually consists of a simple and efficient 
subroutine. After a prespecified number of iterations, or when the solution found 
is within a fixed error tolerance of the lower bound, the algorithm is terminated. 

To solve the p-Median Problem, we choose to relax constraints (12.1). We 
incorporate these constraints in the objective function with the multiplier vector 
λ ∈ ℝ^n. The resulting problem, call it $P_\lambda$, with optimal objective function value 
$Z_\lambda$, is: 

\[
\min \sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij} X_{ij} + \sum_{i=1}^{n} \lambda_i \Bigl( \sum_{j=1}^{m} X_{ij} - 1 \Bigr)
\]

subject to (12.2) - (12.4). 

Disregarding constraint (12.2) for now, the problem decomposes by site, that is, 
each site can be considered separately. Let subproblem $P_\lambda^j$, with optimal objective 
function value $Z_\lambda^j$, be the following. 

\begin{align*}
\min \ & \sum_{i=1}^{n} (c_{ij} + \lambda_i) X_{ij} \\
\text{s.t.} \ & X_{ij} \le Y_j, && \forall i \in N \\
& X_{ij} \in \{0, 1\}, && \forall i \in N \\
& Y_j \in \{0, 1\}.
\end{align*}

Solving Subproblem $P_\lambda^j$ 

Assume λ is fixed. In Problem $P_\lambda^j$, site j is either selected ($Y_j = 1$) or not 
($Y_j = 0$). If site j is not selected, then $X_{ij} = 0$ for all i ∈ N and therefore $Z_\lambda^j = 0$. 
If site j is selected, then we set $Y_j = 1$ and assign exactly those retailers i with 
$c_{ij} + \lambda_i < 0$ to site j. In this case: 

\[
Z_\lambda^j = \sum_{i=1}^{n} \min\{c_{ij} + \lambda_i, 0\}. \tag{12.5}
\]

We see that $P_\lambda^j$ is solved easily and its optimal objective function value is given 
by (12.5). 

To solve $P_\lambda$, we must now reintroduce constraint (12.2). This constraint forces 
us to choose only p of the m sites. In $P_\lambda$, we can incorporate this constraint by 
choosing the p sites with smallest values $Z_\lambda^j$. To do this, let π be a permutation of 
the numbers 1, 2, ..., m such that 

\[
Z_\lambda^{\pi(1)} \le Z_\lambda^{\pi(2)} \le Z_\lambda^{\pi(3)} \le \cdots \le Z_\lambda^{\pi(m)}.
\]

Then the optimal solution to $P_\lambda$ has objective function value: 

\[
Z_\lambda = \sum_{j=1}^{p} Z_\lambda^{\pi(j)} - \sum_{i=1}^{n} \lambda_i.
\]

The value $Z_\lambda$ is a lower bound on the optimal solution of Problem P for any 
vector λ ∈ ℝ^n. To find the best such lower bound, we consider the Lagrangian 
dual: 

\[
\max_{\lambda} \{ Z_\lambda \}.
\]

Using the subgradient procedure (described in Chapter 5.3), we can iteratively 
improve this bound. 
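For a fixed multiplier vector, the lower bound is cheap to compute: evaluate (12.5) for every site, sort, and keep the p smallest values. A minimal sketch follows; the function name `lagrangian_bound`, the cost matrix and the multipliers are illustrative assumptions. The sign convention is the one in which the relaxed objective equals the sum of $(c_{ij} + \lambda_i) X_{ij}$ minus the sum of the multipliers, so useful multipliers are typically negative.

```python
def lagrangian_bound(c, lam, p):
    """Lower bound Z_lambda for the p-Median Problem.

    c[i][j] -- cost of serving retailer i from site j
    lam[i]  -- multiplier of the relaxed assignment constraint of retailer i
    Returns (bound, list of the p chosen sites in the order pi).
    """
    n, m = len(c), len(c[0])
    # Equation (12.5): value of selecting site j.
    site_value = [
        sum(min(c[i][j] + lam[i], 0.0) for i in range(n)) for j in range(m)
    ]
    # The permutation pi sorts the sites by increasing value.
    order = sorted(range(m), key=lambda j: site_value[j])
    chosen = order[:p]
    bound = sum(site_value[j] for j in chosen) - sum(lam)
    return bound, chosen


# Hypothetical data: 4 retailers, 3 sites, p = 2 (optimal cost is 4).
costs = [
    [0, 4, 9],
    [4, 0, 3],
    [9, 3, 0],
    [2, 5, 1],
]
print(lagrangian_bound(costs, [0, 0, 0, 0], 2))       # trivial bound of 0
print(lagrangian_bound(costs, [-1, -1, -1, -2], 2))   # improved bound
```

With all multipliers at zero and nonnegative costs the bound is zero; better multipliers raise it toward the optimal value, which is what the subgradient procedure searches for.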

Upper Bounds 

It is crucial to construct good upper bounds on the optimal solution value as the 
subgradient procedure advances. Clearly, solutions to $P_\lambda$ will not necessarily be 
feasible to Problem P. This is due to the fact that the constraints (12.1) (that each 
retailer choose one, and only one, warehouse) may not be satisfied. The solution to 
$P_\lambda$ may have retailers choosing a number of sites. If, in the solution to $P_\lambda$, each 
retailer chooses only one site, then this must be the optimal solution to P and 
therefore we stop. Otherwise, there are retailers that are assigned to several or no 
sites. A simple heuristic can be implemented which fixes those retailers that are 
assigned to only one site, and assigns the remaining retailers to these and other 
sites by choosing the next site to open in the ordering defined by π. When p sites 
have been selected, a simple check that each retailer is assigned to its closest site 
(of those selected) can further improve the solution. 
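The lower and upper bounding steps can be combined into a small subgradient loop. The sketch below is one possible reading of the procedure, under stated assumptions: the step size rule (a multiple `theta` of the current gap divided by the squared norm of the subgradient) is a common textbook choice and is not taken from Section 5.3, the stopping tolerance is arbitrary, and all function names are hypothetical. The upper bound keeps the sites used by retailers assigned to exactly one site, opens further sites in the order π until p are open, and then assigns every retailer to its closest open site.

```python
def site_values(c, lam):
    n, m = len(c), len(c[0])
    return [sum(min(c[i][j] + lam[i], 0.0) for i in range(n)) for j in range(m)]


def relax(c, lam, p):
    """Solve P_lambda: returns (bound, order pi, chosen sites)."""
    values = site_values(c, lam)
    order = sorted(range(len(values)), key=lambda j: values[j])
    chosen = order[:p]
    return sum(values[j] for j in chosen) - sum(lam), order, chosen


def upper_bound(c, lam, p, order, chosen):
    """Feasible solution built from the relaxed one."""
    keep = []
    for i in range(len(c)):
        used = [j for j in chosen if c[i][j] + lam[i] < 0]
        if len(used) == 1 and used[0] not in keep:
            keep.append(used[0])          # retailer i is assigned to one site only
    for j in order:                       # open more sites in the order pi
        if len(keep) == p:
            break
        if j not in keep:
            keep.append(j)
    cost = sum(min(c[i][j] for j in keep) for i in range(len(c)))
    return cost, sorted(keep)


def subgradient(c, p, iterations=100, theta=1.0):
    n = len(c)
    lam = [0.0] * n
    best_lb, best_ub = float("-inf"), float("inf")
    for _ in range(iterations):
        lb, order, chosen = relax(c, lam, p)
        ub, _ = upper_bound(c, lam, p, order, chosen)
        best_lb, best_ub = max(best_lb, lb), min(best_ub, ub)
        # Subgradient: violation of each relaxed constraint (12.1).
        g = [sum(1 for j in chosen if c[i][j] + lam[i] < 0) - 1 for i in range(n)]
        norm = sum(x * x for x in g)
        if norm == 0 or best_ub - best_lb <= 1e-9:
            break
        step = theta * (best_ub - lb) / norm
        lam = [lam[i] + step * g[i] for i in range(n)]
    return best_lb, best_ub


costs = [[0, 4, 9], [4, 0, 3], [9, 3, 0], [2, 5, 1]]
print(subgradient(costs, 2))
```

On this toy instance the heuristic finds the optimal cost of 4 in the second iteration; the reported lower bound never exceeds it, and the remaining gap measures how far the multipliers are from solving the Lagrangian dual.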

Computational Results 

Below we give a table listing results of various computational experiments. The
retailer locations were chosen uniformly over the unit square. For simplicity, we
made each retailer location a potential site for a warehouse; thus m = n. The
cost of assigning a retailer to a site was the Euclidean distance between the two
locations. The values of $w_i$ were chosen uniformly over the unit interval. We applied
the algorithm described above to many problems and recorded the relative error
of the best solution found and the computation time required. The algorithm is
terminated when the relative error is below 1% or when a prespecified number of
iterations is reached. The numbers below “Error” are the relative errors averaged
over ten randomly generated problem instances. The numbers below “CPU Time”
are the CPU times averaged over the ten problem instances. All computational times
are on an IBM RISC 6000 Model 950.

Table 1: Computational results for the p-Median algorithm

      n      p    Error    CPU Time
     10      3     0.3%        0.2s
     20      4     1.7%        2.6s
     50      5     1.4%       20.7s
    100      7     1.3%       87.7s
    200     10     2.4%      715.4s








12.3 An Algorithm for the Single-Source Capacitated 
Facility Location Problem 

Consider the p-Median Problem where we make the following two changes in our
assumptions.

• The number of warehouses to locate (p) is not fixed beforehand.

• If a warehouse is located at site j:

   o a fixed cost $f_j$ is incurred, and

   o there is a capacity $q_j$ on the amount of demand it can serve.

The problem is to decide where to locate the warehouses and then how the
retailers should be assigned to the open warehouses in such a way that total cost
is minimized. We see that the problem is considerably more complicated than
the p-Median Problem. We now have capacity constraints on the warehouses and
therefore a retailer will not always be assigned to its nearest warehouse. Allowing
the optimization to choose the appropriate number of warehouses also adds to the
level of difficulty.

This problem is called the single-source Capacitated Facility Location Problem 
(CFLP), or sometimes the Capacitated Concentrator Location Problem (CCLP). 
This problem was successfully used in Chapter 14 as a framework for solving the 
Capacitated Vehicle Routing Problem. 

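Using the same decision variables as in the p-Median Problem, we formulate
the single-source CFLP as the following integer linear program.

\begin{align}
\min\ & \sum_{i=1}^{n}\sum_{j=1}^{m} c_{ij}X_{ij} + \sum_{j=1}^{m} f_j Y_j \notag \\
\text{s.t.}\ & \sum_{j=1}^{m} X_{ij} = 1, && \forall i \in N \tag{12.6} \\
& \sum_{i=1}^{n} w_i X_{ij} \le q_j Y_j, && \forall j \in M \tag{12.7} \\
& X_{ij},\, Y_j \in \{0,1\}, && \forall i \in N,\ j \in M. \tag{12.8}
\end{align}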



Constraints (12.6) (along with the integrality conditions (12.8)) ensure that each 
retailer is assigned to exactly one warehouse. Constraints (12.7) ensure that the 
warehouse’s capacity is not exceeded, and also that if a warehouse is not located 
at site j, no retailer can be assigned to that site. 

Let $Z^*$ be the optimal solution value of the single-source CFLP. Note that we have
restricted the assignment variables (X) to be integer. A related problem, where this
assumption is relaxed, is simply called the (multiple-source) Capacitated Facility
Location Problem. In that version, a retailer’s demand can be split between any
number of warehouses. In the single-source CFLP, it is required that each retailer
have only one warehouse supplying it. In many logistics applications, this is a
realistic assumption since without this restriction optimal solutions might have a
retailer receive many deliveries of the same product (each for, conceivably, a very
small amount of the product). Clearly, from a managerial, marketing and accounting
point of view, restricting deliveries to come from only one warehouse is a more
appropriate delivery strategy.

Several algorithms have been proposed to solve the CFLP in the literature; all 
are based on the Lagrangian relaxation technique. This includes Neebe and Rao 
(1983), Barcelo and Casanovas (1984), Klincewicz and Luss (1986) and Pirkul 
(1987). The one we derive here is similar to the algorithm of Pirkul which seems 
to be the most effective. 

We apply the Lagrangian relaxation technique by including constraints (12.6)
in the objective function. For any $\lambda \in \mathbb{R}^n$, consider the following problem $P_\lambda$.

$$\min\ \sum_{i=1}^{n}\sum_{j=1}^{m} c_{ij}X_{ij} + \sum_{j=1}^{m} f_j Y_j + \sum_{i=1}^{n} \lambda_i \Big( \sum_{j=1}^{m} X_{ij} - 1 \Big)$$

subject to (12.7)-(12.8).

Let $Z_\lambda$ be its optimal solution value and note that

$$Z_\lambda \le Z^*, \quad \forall \lambda \in \mathbb{R}^n.$$

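To solve $P_\lambda$, as in the p-Median Problem, we separate the problem by site. For
a given $j \in M$, define the following problem $P_\lambda^j$, with optimal objective function
value $Z_\lambda^j$:

\begin{align*}
\min\ & \sum_{i=1}^{n} (c_{ij} + \lambda_i) X_{ij} + f_j Y_j \\
\text{s.t.}\ & \sum_{i=1}^{n} w_i X_{ij} \le q_j Y_j \\
& X_{ij} \in \{0,1\}, \quad \forall i \in N \\
& Y_j \in \{0,1\}.
\end{align*}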



Solving $P_\lambda^j$

Problem $P_\lambda^j$ can be solved efficiently. In the optimal solution to $P_\lambda^j$, $Y_j$ is either
0 or 1. If $Y_j = 0$, then $X_{ij} = 0$ for all $i \in N$. If $Y_j = 1$, then the problem is no
more difficult than a single-constraint 0-1 Knapsack Problem, for which efficient
algorithms exist; see, for example, Nauss (1976). If the optimal knapsack solution
value is less than $-f_j$, then the corresponding optimal solution to $P_\lambda^j$ is found by
setting $Y_j = 1$ and $X_{ij}$ according to the knapsack solution, indicating whether
retailer i is assigned to site j. If the optimal knapsack solution value is more than
$-f_j$, then the optimal solution to $P_\lambda^j$ is found by setting $Y_j = 0$ and $X_{ij} = 0$ for
all $i \in N$.

The optimal solution value of $P_\lambda$ is then given by

$$Z_\lambda = \sum_{j=1}^{m} Z_\lambda^j - \sum_{i=1}^{n} \lambda_i.$$

For any vector $\lambda \in \mathbb{R}^n$, this is a lower bound on the optimal solution value $Z^*$. In
order to find the best such lower bound we use a subgradient procedure.
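The lower-bound computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm of Pirkul or Nauss: the function names `min_cost_knapsack` and `lagrangian_bound` are hypothetical, demands and capacities are assumed to be positive integers so that a textbook dynamic program solves each knapsack, and `c[i][j]` holds the assignment cost $c_{ij}$.

```python
from typing import List, Sequence, Tuple


def min_cost_knapsack(costs: Sequence[float], weights: Sequence[int],
                      capacity: int) -> Tuple[float, List[int]]:
    """Pick items minimising total cost subject to total weight <= capacity.

    Assumes positive integer weights (an assumption of this sketch).
    Only items with negative cost can improve on the empty knapsack.
    """
    value = [0.0] * (capacity + 1)      # value[c]: best cost within capacity c
    chosen: List[List[int]] = [[] for _ in range(capacity + 1)]
    for i, (cost, wt) in enumerate(zip(costs, weights)):
        if cost >= 0 or wt > capacity:
            continue
        # Descending capacities so that each item is used at most once.
        for cap in range(capacity, wt - 1, -1):
            cand = value[cap - wt] + cost
            if cand < value[cap]:
                value[cap] = cand
                chosen[cap] = chosen[cap - wt] + [i]
    return value[capacity], chosen[capacity]


def lagrangian_bound(c, f, w, q, lam):
    """Return (Z_lambda, [Z_lambda^j for each site j]) for multipliers lam."""
    n, m = len(w), len(f)
    site_values = []
    for j in range(m):
        reduced = [c[i][j] + lam[i] for i in range(n)]   # c_ij + lambda_i
        knap, _ = min_cost_knapsack(reduced, w, q[j])
        # Open site j only if the knapsack value outweighs the fixed cost f_j.
        site_values.append(min(0.0, f[j] + knap))
    return sum(site_values) - sum(lam), site_values
```

On a two-retailer, two-site instance with `c = [[1, 3], [3, 1]]`, unit fixed costs and demands, and capacities of 2, the multipliers `lam = [-3, -3]` give a bound of 4, which equals the cost of opening both sites and serving each retailer locally.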

Note that if the problem has a constraint on the number of warehouses (facilities)
that can be opened (chosen), this can be handled in essentially the same way as
it was handled in the algorithm for the p-Median Problem.

Upper Bounds 

For a given set of multipliers, if the values $X_{ij}$ satisfy (12.6), then we have
an optimal solution to Problem P, and we stop. Otherwise, we perform a simple
subroutine to find a feasible solution to P. The procedure is based on the
observation that the knapsack solutions found when solving $P_\lambda$ give us some
information concerning the benefit of setting up a warehouse at a site (relative to
the current vector $\lambda$). If, for example, the knapsack solution corresponding to a
given site is 0, that is, the optimal knapsack is empty, then this is most likely not
a “good” site to select at this time. In contrast, if the knapsack solution has a very
negative cost, then this is a “good” site. Given the values $Z_\lambda^j$ for each $j \in M$, let
$\pi$ be a permutation of 1, 2, ..., m such that

$$Z_\lambda^{\pi(1)} \le Z_\lambda^{\pi(2)} \le \cdots \le Z_\lambda^{\pi(m)}.$$

The procedure we perform allocates retailers to sites in a myopic fashion. Let
M be the minimum possible number of warehouses used in the optimal solution
to CFLP. This can be found by solving the Bin-Packing Problem defined on the
values $w_i$ with bin capacities $q_j$; see Section 3.2. Starting with the “best” site, in
this case site $\pi(1)$, assign the retailers in its optimal knapsack to this site. Then,
following the indexing of the knapsack solutions, take the next “best” site (say site
$j = \pi(2)$) and solve a new knapsack problem: one defined with costs $\bar{c}_{ij} = c_{ij} + \lambda_i$
for each retailer i still unassigned. Assign all retailers in this knapsack solution to
site j. If this optimal knapsack is empty, then a warehouse is not located at that
site, and we go on to the next site. Continue in this manner until M warehouses
are located.

The solution may still not be a feasible solution to P since some retailers may
not be assigned to a site. In this case, unassigned retailers are assigned to sites
that are already chosen where they fit with minimum additional cost. If needed,
additional warehouses may be opened following the ordering of $\pi$. A local
improvement heuristic can be implemented to improve on this solution, using simple
interchanges between retailers.



Computational Results 
We now report on various computational experiments using this algorithm. The
retailer locations were chosen uniformly over the unit square. Again, for simplicity,
we made each retailer location a potential site for a warehouse; thus, m = n. The
fixed cost of a site was chosen uniformly between 0 and 1. The cost of assigning a
retailer to a site was the Euclidean distance between the two locations. The values
of $w_i$ were chosen uniformly over the interval 0 to ^ with warehouse capacity equal
to 1. We applied the algorithm described above to ten problems and recorded the
average relative error of the best solution found and the average computation time
required. The algorithm is terminated when the relative error is below 1% or when
a prespecified number of iterations is reached. The numbers below “Error” are the
relative errors averaged over the ten randomly generated problem instances. The
numbers below “CPU Time” are the CPU times averaged over the ten problem
instances. All computational times are on an IBM RISC 6000 Model 950.

Table 2: Computational results for the single-source CFLP algorithm

      n    Error    CPU Time
     10     1.2%        1.2s
     20     1.0%        8.1s
     50     1.1%      110.0s
    100     1.1%      558.3s



12.4 A Distribution System Design Problem 

So far the location models we have considered have been concerned with minimiz- 
ing the costs of transporting products between warehouses and retailers. We now 
present a more realistic model that considers the cost of transporting the product 
from manufacturing facilities to the warehouses as well. 

Consider the following warehouse location problem. A set of plants and retailers 
are geographically dispersed in a region. Each retailer experiences demands for a 
variety of products which are manufactured at the plants. A set of warehouses 
must be located in the distribution network from a list of potential sites. 

The cost of locating a warehouse includes the transportation cost per unit from 
warehouses to retailers but also the transportation cost from plants to warehouses. 
In addition, as in the CFLP, there is a site-dependent fixed cost for locating each 
warehouse. 

The data for the problem are the following. 

• L = number of plants; we will also let L = {1, 2, ..., L}

• J = number of potential warehouse sites; also let J = {1, 2, ..., J}

• I = number of retailers; also let I = {1, 2, ..., I}

• K = number of products; also let K = {1, 2, ..., K}

• W = number of warehouses to locate

• $c_{\ell jk}$ = cost of shipping one unit of product k from plant $\ell$ to warehouse site j

• $d_{jik}$ = cost of shipping one unit of product k from warehouse site j to retailer i

• $f_j$ = fixed cost of locating a warehouse at site j

• $v_{\ell k}$ = supply of product k at plant $\ell$

• $w_{ik}$ = demand for product k at retailer i

• $s_k$ = volume of one unit of product k

• $q_j$ = capacity (in volume) of a warehouse at site j

We make the additional assumption that a retailer gets delivery for a product 
from one warehouse only. This does not preclude solutions where a retailer gets 
shipments from different warehouses, but these shipments must be for different 
products. On the other hand, we assume that the warehouse can receive shipments 
from any plant and for any amount of product. 

The problem is to determine where to locate the warehouses, how to ship the 
product from the plants to the warehouses and also how to ship the product from 
the warehouses to the retailers. This problem is similar to one analyzed by Pirkul 
and Jayaraman (1996). 

We again use a mathematical programming approach. Define the following
decision variables:

$$Y_j = \begin{cases} 1, & \text{if a warehouse is located at site } j, \\ 0, & \text{otherwise,} \end{cases}$$

and

$U_{\ell jk}$ = amount of product k shipped from plant $\ell$ to warehouse j,

for each $\ell \in L$, $j \in J$ and $k \in K$. Also define:

$$X_{jik} = \begin{cases} 1, & \text{if retailer } i \text{ receives product } k \text{ from warehouse } j, \\ 0, & \text{otherwise,} \end{cases}$$

for each $j \in J$, $i \in I$ and $k \in K$.

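Then the Distribution System Design Problem can be formulated as the following
integer program.

\begin{align}
\min\ & \sum_{\ell=1}^{L}\sum_{j=1}^{J}\sum_{k=1}^{K} c_{\ell jk}U_{\ell jk}
 + \sum_{i=1}^{I}\sum_{j=1}^{J}\sum_{k=1}^{K} d_{jik}w_{ik}X_{jik}
 + \sum_{j=1}^{J} f_j Y_j \notag \\
\text{s.t.}\ & \sum_{j=1}^{J} X_{jik} = 1, && \forall i \in I,\ k \in K \tag{12.9} \\
& \sum_{i=1}^{I}\sum_{k=1}^{K} s_k w_{ik} X_{jik} \le q_j Y_j, && \forall j \in J \tag{12.10} \\
& \sum_{i=1}^{I} w_{ik} X_{jik} = \sum_{\ell=1}^{L} U_{\ell jk}, && \forall j \in J,\ k \in K \tag{12.11} \\
& \sum_{j=1}^{J} U_{\ell jk} \le v_{\ell k}, && \forall \ell \in L,\ k \in K \tag{12.12} \\
& \sum_{j=1}^{J} Y_j = W, \tag{12.13} \\
& Y_j,\, X_{jik} \in \{0,1\}, && \forall i \in I,\ j \in J,\ k \in K \tag{12.14} \\
& U_{\ell jk} \ge 0, && \forall \ell \in L,\ j \in J,\ k \in K. \tag{12.15}
\end{align}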

The objective function measures the transportation costs between plants and ware- 
houses, between warehouses and retailers and also the fixed cost of locating the 
warehouses. Constraints (12.9) ensure that each retailer/product pair is assigned to 
one warehouse. Constraints (12.10) guarantee that the capacity of the warehouses 
is not exceeded. Constraints (12.11) ensure that there is a conservation of the flow 
of products at each warehouse; that is, the amount of each product arriving at a 
warehouse from the plants is equal to the amount being shipped from the ware- 
house to the retailers. Constraints (12.12) are the supply constraints. Constraints 
(12.13) ensure that we locate exactly W warehouses. 

The model can handle several extensions such as a warehouse handling fee or a
limit on the distance of any link used just as in the p-Median Problem. Another
interesting extension is when there are a fixed number of possible warehouse types 
from which to choose. Each type has a specific cost along with a specific capacity. 
The model can be easily extended to handle this situation (see Exercise 12.1). 

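As in the previous problems, we will use Lagrangian relaxation. We relax
constraints (12.9) (with multipliers $\lambda_{ik}$) and constraints (12.11) (with multipliers
$\theta_{jk}$). The resulting problem is:

\begin{align*}
\min\ & \sum_{\ell=1}^{L}\sum_{j=1}^{J}\sum_{k=1}^{K} c_{\ell jk}U_{\ell jk}
 + \sum_{j=1}^{J}\sum_{i=1}^{I}\sum_{k=1}^{K} d_{jik}w_{ik}X_{jik}
 + \sum_{j=1}^{J} f_j Y_j \\
& + \sum_{j=1}^{J}\sum_{k=1}^{K} \theta_{jk}\Big[\sum_{i=1}^{I} w_{ik}X_{jik} - \sum_{\ell=1}^{L} U_{\ell jk}\Big]
 + \sum_{i=1}^{I}\sum_{k=1}^{K} \lambda_{ik}\Big[1 - \sum_{j=1}^{J} X_{jik}\Big]
\end{align*}

subject to (12.10), (12.12)-(12.15).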

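Let $Z_{\lambda,\theta}$ be the optimal solution value of this problem. This problem can be
decomposed into two separate problems $P_1$ and $P_2$. They are the following.

Problem $P_1$:

\begin{align}
Z_1 = \min\ & \sum_{\ell=1}^{L}\sum_{j=1}^{J}\sum_{k=1}^{K} \big[c_{\ell jk} - \theta_{jk}\big] U_{\ell jk} \notag \\
\text{s.t.}\ & \sum_{j=1}^{J} U_{\ell jk} \le v_{\ell k}, && \forall \ell \in L,\ k \in K \tag{12.16} \\
& U_{\ell jk} \ge 0, && \forall \ell \in L,\ j \in J,\ k \in K. \notag
\end{align}

Problem $P_2$:

\begin{align}
Z_2 = \min\ & \sum_{j=1}^{J}\sum_{i=1}^{I}\sum_{k=1}^{K} \big[d_{jik}w_{ik} + \theta_{jk}w_{ik} - \lambda_{ik}\big] X_{jik} + \sum_{j=1}^{J} f_j Y_j \notag \\
\text{s.t.}\ & \sum_{i=1}^{I}\sum_{k=1}^{K} s_k w_{ik} X_{jik} \le q_j Y_j, && \forall j \in J \tag{12.17} \\
& \sum_{j=1}^{J} Y_j = W, \tag{12.18} \\
& Y_j,\, X_{jik} \in \{0,1\}, && \forall i \in I,\ j \in J,\ k \in K. \notag
\end{align}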



Solving $P_1$

Problem $P_1$ can be solved separately for each plant/product pair. In fact, the
objective function values of these subproblems can be improved (without loss in
computation time) by adding the constraints:

$$s_k \sum_{\ell=1}^{L} U_{\ell jk} \le q_j, \quad \forall j \in J,\ k \in K. \tag{12.19}$$

For each plant/product combination, say plant $\ell$ and product k, sort the J values
$\bar{c}_j = c_{\ell jk} - \theta_{jk}$. Starting with the smallest value of $\bar{c}_j$, say $\bar{c}_{j'}$, if $\bar{c}_{j'} \ge 0$, then the
solution is to ship none of this product from this plant. If $\bar{c}_{j'} < 0$, then ship as
much of this product as possible along arc $(\ell, j')$ subject to satisfying constraints
(12.16) and (12.19). Then if the supply $v_{\ell k}$ has not been completely shipped, do
the same for the next cheapest arc, as long as it has negative reduced cost ($\bar{c}$).
Continue in this manner until all of the product has been shipped or the reduced
costs are no longer negative. Then proceed to the next plant/product combination,
repeating this procedure. Continue until all the plant/product combinations have
been scanned in this fashion.
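The greedy scan for a single plant/product pair can be sketched as follows. This is a minimal illustration, not the authors' code: the name `ship_greedy` is hypothetical, and `arc_capacity[j]` is assumed to hold whatever amount constraint (12.19) still allows into warehouse j for this product after earlier plants have been processed.

```python
def ship_greedy(reduced_costs, supply, arc_capacity):
    """Greedy flows for one plant/product pair in Problem P1.

    reduced_costs[j] : c_ljk - theta_jk for warehouse j
    supply           : v_lk, the amount available at the plant
    arc_capacity[j]  : assumed residual limit on the arc to warehouse j
    Returns the list of flows U_ljk, one per warehouse.
    """
    flows = [0] * len(reduced_costs)
    remaining = supply
    # Scan arcs from the most negative reduced cost upwards.
    for j in sorted(range(len(reduced_costs)), key=lambda j: reduced_costs[j]):
        if reduced_costs[j] >= 0 or remaining <= 0:
            break                      # no further arc can lower the objective
        amount = min(remaining, arc_capacity[j])
        flows[j] = amount
        remaining -= amount
    return flows
```

For example, with reduced costs `[-1, 2, -3]`, a supply of 10 and arc limits `[4, 5, 7]`, the sketch ships 7 units on the third arc, the remaining 3 units on the first, and nothing on the arc with positive reduced cost.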

Solving $P_2$

Solving Problem $P_2$ is similar to solving the subproblem in the CFLP. For now
we can ignore constraints (12.18). Then we separate the problem by warehouse. In
the problem corresponding to warehouse j, either $Y_j = 0$ or $Y_j = 1$. If $Y_j = 0$, then
$X_{jik} = 0$ for all $i \in I$ and $k \in K$. If $Y_j = 1$, then we get a Knapsack Problem with
IK items, one for each retailer/product pair. Let $Z_2^j$ be the objective function
value when $Y_j$ is set to 1 and the resulting knapsack problem is solved. After
having solved each of these, let $\pi$ be a permutation of the numbers 1, 2, ..., J such
that

$$Z_2^{\pi(1)} \le Z_2^{\pi(2)} \le \cdots \le Z_2^{\pi(J)}.$$

The optimal solution to $P_2$ is to choose the W smallest values:

$$Z_2 = \sum_{j=1}^{W} Z_2^{\pi(j)}.$$

For fixed vectors $\lambda$ and $\theta$, the lower bound is

$$Z_{\lambda,\theta} = Z_1 + Z_2 + \sum_{i=1}^{I}\sum_{k=1}^{K} \lambda_{ik}.$$

To maximize this bound, that is,

$$\max_{\lambda,\theta}\ \{Z_{\lambda,\theta}\},$$

we again use the subgradient optimization procedure.
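Once the per-warehouse values $Z_2^j$ are known, assembling the bound is a small computation, sketched below under assumptions. The function names are hypothetical, `site_values[j]` is assumed to hold $Z_2^j$ (the value with $Y_j$ forced to 1), `z1` is the value of Problem $P_1$, and `lam[i][k]` holds $\lambda_{ik}$.

```python
def solve_p2(site_values, W):
    """Pick the W warehouses with the smallest values Z_2^j.

    Returns (Z_2, list of chosen warehouse indices in increasing order of value).
    """
    order = sorted(range(len(site_values)), key=lambda j: site_values[j])
    chosen = order[:W]
    return sum(site_values[j] for j in chosen), chosen


def lower_bound(z1, site_values, W, lam):
    """Lagrangian bound Z_1 + Z_2 + sum over (i, k) of lambda_ik."""
    z2, _ = solve_p2(site_values, W)
    return z1 + z2 + sum(sum(row) for row in lam)
```

Because constraint (12.18) forces exactly W open warehouses, the sketch takes the W smallest values even when some of them are positive.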

Upper Bounds 

At each iteration of the subgradient procedure, we attempt to construct a
feasible solution to the problem. Consider Problem $P_2$. Its solution may have a
retailer/product combination assigned to several warehouses. We determine the set
of retailer/product combinations that are assigned to one and only one warehouse
and fix these. Other retailer/product combinations are assigned to warehouses using
the following mechanism. For each retailer/product combination we determine the
cost of assigning it to a particular warehouse. After determining that this
assignment is feasible (from a warehouse capacity point of view), the assignment cost
is calculated as the cost of shipping all of the demand for this retailer/product
combination through the warehouse plus the cost of shipping the demand from the
plants to the warehouse (along one or more arcs from the plants to the warehouse).
For each retailer/product combination we determine the penalty associated with
assigning the shipment to its second best warehouse instead of its best warehouse.
We then assign the retailer/product combination with the highest such penalty
and update all arc flows and remaining capacities. We continue in this manner
until all retailer/product combinations have been assigned to warehouses.

Computational results for this problem appear at the end of Chapter 17.

12.5 The Structure of the Asymptotic Optimal Solution

In this section we describe a region partitioning scheme to solve large instances of 
the CFLP. 

Assume there are n retailers located at points $\{x_1, x_2, \ldots, x_n\}$. Each retailer
also serves as a potential site for a warehouse of fixed capacity q. The fixed cost
of locating a warehouse at a site is assumed to be proportional to the distance the
site is from a manufacturing facility located at $x_0$, which is assumed (without loss
of generality) to be the origin (0,0). Retailer i has a demand $w_i$ which is assumed
to be less than or equal to q. Without loss of generality, we assume q = 1 and
therefore $w_i \in [0,1]$ for each $i \in N$. Let $\alpha$ be the per unit cost of transportation
between warehouses and the manufacturing facility, and let $\beta$ be the per unit cost
of transportation between warehouses and retailers.

We assume the retailer locations are independently and identically distributed
in a compact region $A \subset \mathbb{R}^2$ according to some distribution $\mu$. Assume the retailer
demands are independently and identically distributed according to a probability
measure $\phi$ on [0,1]. The bin-packing constant associated with the distribution $\phi$
(denoted by $\gamma_\phi$ or simply $\gamma$) is the asymptotic number of bins used per item in an
optimal packing of the retailer demands into unit size bins, when items are drawn
randomly from the distribution $\phi$ (see Section 4.2).

The following theorem shows that if the retailer locations and demand sizes
are random (from a general class of distributions), then as the problem size
increases, the optimal solution has a very particular structure. This structure can
be exploited using a region partitioning scheme as demonstrated below.

Theorem 12.5.1 Let $x_k$, $k = 1, 2, \ldots, n$ be a sequence of independent random
variables having a distribution $\mu$ with compact support in $\mathbb{R}^2$. Let $\|x\|$ be the
Euclidean distance between the manufacturing facility and the point $x \in \mathbb{R}^2$, and
let

$$E(d) = \int \|x\| \, d\mu(x).$$

Let the demands $w_k$, $k = 1, 2, \ldots, n$ be a sequence of independent random variables
having a distribution $\phi$ with bin-packing constant equal to $\gamma$. Then, almost surely,

$$\lim_{n \to \infty} \frac{1}{n} Z_n^* = \alpha \gamma E(d).$$

This analysis demonstrates that simple approaches which consider only the ge- 
ography and the packing of the demands can be very efficient on large problem 
instances. Asymptotically, this is in fact the optimal strategy. This analysis also 
demonstrates that, asymptotically, the cost of transportation between retailers and 
warehouses becomes a very small fraction (eventually zero) of the total cost. 
Exercise 12.1. In the Distribution System Design Problem, explain how the
solution methodology changes when there are a fixed number of possible warehouse
capacities. For example, at each site, if we decide to install a warehouse, we can
install a small, medium or large one.

Exercise 12.2. Prove Theorem 12.5.1.

Exercise 12.3. Show how any instance of the Bin-Packing Problem (see Part I)
can be formulated as an instance of the Single-Source CFLP.

Exercise 12.4. Consider Problem $P_1$ of Section 12.4.

(a) Show that this formulation can be strengthened by adding the constraints:

$$\sum_{\ell=1}^{L}\sum_{k=1}^{K} s_k U_{\ell jk} \le q_j, \quad \forall j \in J.$$

(b) Show that this new formulation can be transformed to a specialized kind of
linear program called a transportation problem.

(c) Why might we not want to use this stronger formulation?

Exercise 12.5. (Mirchandani and Francis, 1990) Define the Uncapacitated Facility
Location Problem (UFLP) in the following way. Let $F_j$ be the fixed charge of
opening a facility at site j, for j = 1, 2, ..., m.

Problem UFLP:

\begin{align*}
\min\ & \sum_{i=1}^{n}\sum_{j=1}^{m} c_{ij}X_{ij} + \sum_{j=1}^{m} F_j Y_j \\
\text{s.t.}\ & \sum_{j=1}^{m} X_{ij} = 1, && \forall i \in N \\
& X_{ij} \le Y_j, && \forall i \in N,\ j \in M \\
& X_{ij},\, Y_j \in \{0,1\}, && \forall i \in N,\ j \in M.
\end{align*}

Show that UFLP is NP-Hard by showing that any instance of the NP-Hard
Node Cover Problem can be formulated as an instance of UFLP. The Node Cover
Problem is defined as follows: given a graph G and an integer k, does there exist
a subset of k nodes of G that cover all the arcs of G? (Node v is said to cover arc
e if v is an end-point of e.)

Exercise 12.6. (Mirchandani and Francis, 1990) It appears that the p-Median
Problem can be solved by solving the resulting problem UFLP (see Exercise 12.5)
for different values of $F = F_j$, $\forall j$, until a value $F^*$ is found where the UFLP opens
exactly p facilities. Show that this method does not work by giving an instance
of a 2-Median problem for which no value of F provides an optimal solution to
UFLP with two open facilities.




Part IV 



VEHICLE ROUTING MODELS 
The Capacitated VRP with Equal Demands



13.1 Introduction 

A large part of many logistics systems involves the management of a fleet of vehicles 
used to serve warehouses, retailers and/or customers. In order to control the costs 
of operating the fleet, a dispatcher must continuously make decisions on how much 
to load on each vehicle and where to send it. These types of problems fall under 
the general class of Vehicle Routing Problems mentioned in Chapter 1. 

The most basic Vehicle Routing Problem (VRP) is the single-depot Capacitated 
Vehicle Routing Problem (CVRP). It can be described as follows: a set of customers 
has to be served by a fleet of identical vehicles of limited capacity. The vehicles are 
initially located at a given depot. The objective is to find a set of routes for the 
vehicles of minimal total length. Each route begins at the depot, visits a subset of 
the customers and returns to the depot without violating the capacity constraint. 

Consider the following scenario. A customer requests w units of product. If we 
allow this load to be split between more than one vehicle (i.e., the customer gets 
several deliveries which together sum up to the total load requested), then we can 
view the demand for w units as w different customers each requesting one unit of 
product located at the same point. The capacity constraint can then be viewed 
as simply the maximum number of customers (in this new problem) that can be 
visited by a single vehicle. This is the capacity $Q > 1$. Therefore, if we allow this
splitting of demands (which may not be a desirable property; we investigate the
unsplit demand case in Chapter 14), there is no loss in generality in assuming that
each customer has the same demand, namely, one unit, and the vehicle can visit
at most Q of these customers on a route. For this reason, this model is sometimes
called the CVRP with splittable demands or the ECVRP.

We denote the depot by $x_0$ and the set of customers by $N = \{x_1, x_2, \ldots, x_n\}$.
The set $N_0 = N \cup \{x_0\}$ designates all customers and the depot. The customers and
the depot are represented by a set of nodes on an undirected graph $G = (N_0, E)$.
We denote by $d_i$ the distance between customer i and the depot, by
$d_{\max} = \max_{i \in N} d_i$ the distance from the depot to the furthest customer, and by
$d_{ij}$ the distance between customer i and customer j. The distance matrix $\{d_{ij}\}$ is
assumed to be symmetric and to satisfy the triangle inequality; that is, $d_{ij} = d_{ji}$
for all i, j and $d_{ij} \le d_{ik} + d_{kj}$ for all i, k, j. We denote the optimal solution value of
the CVRP by $Z^*$ and the solution value provided by a heuristic H by $Z^H$.

In what follows, the optimal traveling salesman tour plays an important role.
So, for any set $S \subseteq N_0$, let $L^*(S)$ be the length of the optimal traveling salesman
tour through the set of points S. Also, let $L^\alpha(S)$ be the length of an $\alpha$-optimal
traveling salesman tour through S, that is, one whose length is bounded from
above by $\alpha L^*(S)$, $\alpha \ge 1$.

The graph depicted in Figure 13.1, which is denoted by $G(t, s)$, also plays an
important role in our worst-case analyses. It consists of s groups of Q nodes and
another s - 1 nodes, called white nodes, separating the groups. The nodes within
the same group have zero interdistance and each group is connected to the depot
by an arc of unit length. The white nodes are of zero distance apart and t units
distance away from the depot. Each white node is connected to the two groups
of nodes it separates by an arc of unit length. Note that when $0 \le t \le 2$, $G(t, s)$
satisfies the triangle inequality (if an edge (i, j) is not shown in the graph, then
the distance between node i and node j is defined as the length of the shortest
path from i to j). Also note that whenever $0 \le t \le 2$, the tour depicted in Figure
13.2 is an optimal traveling salesman tour of length 2s.

In this chapter, we analyze this problem using the two tools developed earlier, 
worst-case and average-case analyses. Later, in Chapter 14, we will analyze a more 
general model of the CVRP. 



13.2 Worst-Case Analysis of Heuristics 

A simple heuristic for the CVRP, suggested by Haimovich and Rinnooy Kan (1985)
and later modified by Altinkemer and Gavish (1990), is to partition a traveling
salesman tour into segments, such that each segment of customers is served by a
single vehicle; that is, each segment has no more than Q points. The heuristic,
called the Iterated Tour Partitioning (ITP) heuristic, starts from a traveling
salesman tour through all n = |N| customers and the depot. Starting at the depot and
following the tour in an arbitrary orientation, the customers and the depot are
numbered $x^{(0)}, x^{(1)}, x^{(2)}, \ldots, x^{(n)}$, where $x^{(0)}$ is the depot. We partition the path
from $x^{(1)}$ to $x^{(n)}$ into $\lceil n/Q \rceil$ (or $\lceil n/Q \rceil + 1$) disjoint segments, such that each one
contains no more than Q customers, and connect the end-points of each segment to
the depot. The first segment contains only customer $x^{(1)}$. All the other segments
contain exactly Q customers, except maybe the last one. This defines one feasible
solution to the problem. We can repeat the above construction by shifting the
end-points of all but the first and last segments up by one position in the direction of
the orientation. This can be repeated Q - 1 times, producing a total of Q different
solutions. We then choose the best of the set of Q solutions generated.

FIGURE 13.1. Every group contains Q customers with interdistance zero.

FIGURE 13.2. An optimal traveling salesman tour in $G(t, s)$.
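The Iterated Tour Partitioning construction described above can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: the names `route_length` and `itp` are hypothetical, `dist` is assumed to be a symmetric distance matrix with the depot at index 0, and `tour` is assumed to list the customers in the order $x^{(1)}, \ldots, x^{(n)}$ of a given traveling salesman tour.

```python
def route_length(dist, segment):
    """Length of the route depot -> segment[0] -> ... -> segment[-1] -> depot."""
    length = dist[0][segment[0]] + dist[segment[-1]][0]
    for a, b in zip(segment, segment[1:]):
        length += dist[a][b]
    return length


def itp(dist, tour, Q):
    """Return (cost, routes) of the best of the Q tour partitions.

    In the i-th partition the first segment holds the first i customers of
    the tour and every later segment holds Q customers (the last may be
    shorter).
    """
    best_cost, best_routes = None, None
    for first in range(1, Q + 1):
        routes = [tour[:first]]
        routes += [tour[s:s + Q] for s in range(first, len(tour), Q)]
        cost = sum(route_length(dist, r) for r in routes)
        if best_cost is None or cost < best_cost:
            best_cost, best_routes = cost, routes
    return best_cost, best_routes
```

As a small check, place the depot at 0 and four customers at positions 1, 2, 3, 4 on a line, with Q = 2. The partition starting with a single customer costs 16, while the shifted partition {1, 2}, {3, 4} costs 12, so the sketch returns the latter.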

It is easy to see that, for a given traveling salesman tour, the running time
of the ITP heuristic is O(nQ). The performance of this heuristic clearly depends
on the quality of the initial traveling salesman tour chosen in the first step of
the algorithm. Hence, when the ITP heuristic partitions an $\alpha$-optimal traveling
salesman tour, it is denoted ITP($\alpha$). To establish the worst-case behavior of the
algorithm, we first find a lower bound on $Z^*$, and then calculate an upper bound
on the cost of the solution produced by the ITP($\alpha$) heuristic.

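Lemma 13.2.1 $Z^* \ge \max\Big\{ L^*(N_0),\ \frac{2}{Q} \sum_{i \in N} d_i \Big\}$.

Proof. Clearly, $Z^* \ge L^*(N_0)$ by the triangle inequality. To prove
$Z^* \ge \frac{2}{Q} \sum_{i \in N} d_i$, consider an optimal solution in which N is partitioned into
subsets $\{N_1, N_2, \ldots, N_m\}$ where each set $N_j$ is served by a single vehicle. Clearly,

$$Z^* = \sum_{j} L^*(N_j \cup \{x_0\}) \ge \sum_{j} 2 \max_{i \in N_j} d_i
\ge \sum_{j} \frac{2}{|N_j|} \sum_{i \in N_j} d_i
\ge \frac{2}{Q} \sum_{j} \sum_{i \in N_j} d_i = \frac{2}{Q} \sum_{i \in N} d_i.$$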
Lemma 13.2.2 $Z^{ITP(\alpha)} \le \frac{2}{Q} \sum_{i \in N} d_i + \Big(1 - \frac{1}{Q}\Big) \alpha L^*(N_0)$.

Proof. We prove the lemma by finding the cumulative length of the Q solutions
generated by the ITP heuristic. The i-th solution consists of the segments:

$$\{x^{(1)}, x^{(2)}, \ldots, x^{(i)}\},\ \{x^{(i+1)}, x^{(i+2)}, \ldots, x^{(i+Q)}\},\ \ldots,\
\{x^{(i+1+\lfloor (n-i-1)/Q \rfloor Q)}, \ldots, x^{(n)}\}.$$

Thus, among the Q solutions generated, each customer $x^{(i)}$, $2 \le i \le n-1$, appears
exactly once as the first point of a segment and exactly once as the last point.
Therefore, in the cumulative length of the Q solutions the term $2d_{x^{(i)}}$ is incurred
for each i, $2 \le i \le n-1$. Customer $x^{(1)}$ is the first point of a segment in each
of the Q solutions, and in the first one it is also the last point. Thus, the term
$d_{x^{(1)}}$ appears Q + 1 times in the cumulative length. Similarly, $x^{(n)}$ is always the
last point of a segment in each of the Q solutions, and once the first point. Thus,
the term $d_{x^{(n)}}$ appears Q + 1 times in the cumulative length as well. Finally, each
one of the arcs $(x^{(i)}, x^{(i+1)})$ for $1 \le i \le n-1$ appears in exactly Q - 1 solutions
since it is excluded from only one solution. These arcs, together with the Q - 1
arcs connecting the depot to $x^{(1)}$ and Q - 1 arcs connecting the depot to $x^{(n)}$,
form Q - 1 copies of the initial traveling salesman tour selected in the first step
of the heuristic. Thus, if the initial traveling salesman tour is an $\alpha$-optimal tour,
the cumulative length of all Q tours is

$$2 \sum_{i \in N} d_i + (Q-1) L^\alpha(N_0) \le 2 \sum_{i \in N} d_i + (Q-1)\alpha L^*(N_0).$$

Hence,

$$Z^{ITP(\alpha)} \le \frac{2}{Q} \sum_{i \in N} d_i + \Big(1 - \frac{1}{Q}\Big) \alpha L^*(N_0).$$



Combining upper and lower bounds, we obtain the following result. 

Theorem 13.2.3

$$\frac{Z^{ITP(\alpha)}}{Z^*} \le 1 + \Big(1 - \frac{1}{Q}\Big)\alpha. \tag{13.1}$$



For example, if Christofides’ polynomial-time heuristic ($\alpha = 1.5$) is used to
obtain the initial traveling salesman tour, we have

$$\frac{Z^{ITP(1.5)}}{Z^*} \le \frac{5}{2} - \frac{3}{2Q}.$$
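For readers who want the intermediate step, the bound (13.1) follows by applying each half of Lemma 13.2.1 to the corresponding term of Lemma 13.2.2; the chain below is a worked restatement of that argument, followed by the substitution $\alpha = 3/2$:

```latex
\begin{align*}
Z^{ITP(\alpha)}
  &\le \frac{2}{Q}\sum_{i \in N} d_i + \Big(1 - \frac{1}{Q}\Big)\alpha L^*(N_0)
     && \text{(Lemma 13.2.2)} \\
  &\le Z^* + \Big(1 - \frac{1}{Q}\Big)\alpha Z^*
     && \text{(Lemma 13.2.1, used twice)} \\
\frac{Z^{ITP(3/2)}}{Z^*}
  &\le 1 + \frac{3}{2}\Big(1 - \frac{1}{Q}\Big)
   = \frac{5}{2} - \frac{3}{2Q}.
\end{align*}
```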



The proof of the worst-case result for the ITP($\alpha$) heuristic suggests that if we
can improve the bound in (13.1) for $\alpha = 1$, then the bound can be improved for
any $\alpha > 1$. However, the following theorem, proved by Li and Simchi-Levi (1990),
says that this is impossible; that is, the bound

$$\frac{Z^{ITP(1)}}{Z^*} \le 2 - \frac{1}{Q}$$

is sharp.



Theorem 13.2.4 For any integer $Q > 1$, there exists a problem instance with
$Z^{ITP(1)}/Z^* = 2 - \frac{1}{Q}$.

Proof. Let us consider the graph $G(0, Q)$. A solution obtained by the ITP heuristic
is shown in Figure 13.3. In this solution,

$$Z^{ITP(1)} = 2 + 2 + \underbrace{4 + 4 + \cdots + 4}_{Q-2 \text{ times}} + 2 = 4Q - 2.$$

One can construct a solution that has Q vehicles serve the Q groups of customers
and the (Q + 1)st vehicle serve the other Q - 1 nodes. Thus,

$$Z^* \le 2Q.$$

Hence,

$$\frac{Z^{ITP(1)}}{Z^*} \ge 2 - \frac{1}{Q}.$$

This together with the upper bound of (13.1) completes the proof.

FIGURE 13.3. Solution obtained by the ITP heuristic

Another variant of the tour partitioning heuristic is the Optimal Partitioning 
(OP) heuristic described by Beasley (1983). The algorithm takes a traveling sales- 
man tour and optimally partitions it into a set of feasible routes; that is, each 
route contains at most Q customers. 

Given a traveling salesman tour through the customers and the depot, the points 
are numbered $x^{(0)}, x^{(1)}, \ldots, x^{(n)}$ in order of appearance on the tour, where $x^{(0)}$ is 
the depot. Define 

$$C_{jk} = \begin{cases} \text{the distance traveled by a vehicle that starts at } x^{(0)}, \text{ visits customers } x^{(j+1)}, x^{(j+2)}, \ldots, x^{(k)} \text{ and returns to } x^{(0)}, & \text{if } k - j \le Q; \\ \infty, & \text{otherwise.} \end{cases}$$

If we find the shortest path from $x^{(0)}$ to $x^{(n)}$ in the acyclic graph (with nodes 
$x^{(i)}$, $0 \le i \le n$, and arcs $(x^{(i)}, x^{(j)})$ for $0 \le i < j \le n$) where the distance between 
$x^{(j)}$ and $x^{(k)}$ is $C_{jk}$, we will have an optimal partition of the traveling salesman tour 
into feasible routes. For example, if the shortest path from $x^{(0)}$ to $x^{(n)}$ is 
$x^{(0)} \to x^{(t)} \to x^{(u)} \to x^{(n)}$, then three tours are formed, namely, $(x^{(0)}, x^{(1)}, \ldots, x^{(t)}, x^{(0)})$, 
$(x^{(0)}, x^{(t+1)}, x^{(t+2)}, \ldots, x^{(u)}, x^{(0)})$ and $(x^{(0)}, x^{(u+1)}, x^{(u+2)}, \ldots, x^{(n)}, x^{(0)})$. 
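
The shortest path computation can be sketched as a simple forward recursion over the tour positions. This is a minimal illustration, assuming Euclidean distances and a tour given as a list of points with the depot first; `optimal_partition` is a hypothetical name, and the arc costs are accumulated incrementally so that each one is evaluated in constant time.

```python
import math


def optimal_partition(tour, Q):
    """Split tour[1:] (tour[0] is the depot) into routes of at most Q customers.

    Returns (total_length, routes); each route is a list of tour positions.
    """
    n = len(tour) - 1
    best = [0.0] + [math.inf] * n    # best[k]: shortest path length from x(0) to x(k)
    pred = [0] * (n + 1)             # predecessor of x(k) on that shortest path
    for j in range(n):
        inner = 0.0                  # length of the path x(j+1) -> ... -> x(k)
        for k in range(j + 1, min(j + Q, n) + 1):
            if k > j + 1:
                inner += math.dist(tour[k - 1], tour[k])
            c_jk = math.dist(tour[0], tour[j + 1]) + inner + math.dist(tour[k], tour[0])
            if best[j] + c_jk < best[k]:
                best[k], pred[k] = best[j] + c_jk, j
    routes, k = [], n
    while k > 0:                     # walk the predecessors back to the depot
        routes.append(list(range(pred[k] + 1, k + 1)))
        k = pred[k]
    return best[n], routes[::-1]


tour = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (-1.0, 1.0), (-1.0, 0.0)]
total, routes = optimal_partition(tour, Q=2)
```

In this small instance the two customers on each side of the depot end up sharing a vehicle. The nested loops visit at most Q arcs per node, which matches the O(nQ) running time.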

For a given traveling salesman tour, the above shortest path problem can be 
solved in O(nQ) time including the time required to evaluate the costs $C_{jk}$. 

When the OP heuristic partitions an α-optimal traveling salesman tour, it is 
denoted OP(α). The partitions considered by the OP(α) heuristic include all Q 
of the partitions generated by the ITP(α) heuristic. Therefore, $Z^{OP(\alpha)} \le Z^{ITP(\alpha)}$ 
and hence its worst-case bound is at least as good; that is, 

$$\frac{Z^{OP(\alpha)}}{Z^*} \le 1 + \left(1 - \frac{1}{Q}\right)\alpha.$$

The next theorem implies that for α = 1 this bound is asymptotically sharp; that 
is, $Z^{OP(1)}/Z^*$ tends to 2 when Q approaches infinity. 



Theorem 13.2.5 For any integer $Q \ge 1$, there exists a problem instance with 
$Z^{OP(1)}/Z^*$ arbitrarily close to $2 - \frac{2}{Q+1}$. 

Proof. Consider the graph $G(1, KQ+1)$, where K is a positive integer. It is easy 
to check that 

$$Z^{OP(1)} = 2(KQ + 1) + 2KQ.$$

On the other hand, consider the solution in which $KQ + 1$ vehicles serve the $KQ + 1$ 
groups of customers and another K vehicles serve the other nodes. Hence, 

$$Z^* \le 2(KQ + 1) + 2K,$$

and therefore, 

$$\lim_{K \to \infty} \frac{Z^{OP(1)}}{Z^*} \ge 2 - \frac{2}{Q+1}.$$
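
The limit follows by dividing numerator and denominator by K. A short worked version, using only the two expressions from the proof:

```latex
\[
\frac{Z^{OP(1)}}{Z^*} \;\ge\; \frac{2(KQ+1) + 2KQ}{2(KQ+1) + 2K}
\;=\; \frac{4Q + 2/K}{2Q + 2 + 2/K}
\;\xrightarrow[K \to \infty]{}\; \frac{4Q}{2Q+2} \;=\; \frac{2Q}{Q+1} \;=\; 2 - \frac{2}{Q+1}.
\]
```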



13.3 The Asymptotic Optimal Solution Value 

In the following two sections, we assume that the customers are points in the plane 
and that the distance between any pair of customers is given by the Euclidean 
distance. Assume without loss of generality that the depot is the point (0,0) and 
$\|x\|$ designates the distance from the depot to the point $x \in \mathbb{R}^2$. The results 
discussed in this section and the next are mainly based on Haimovich and Rinnooy 
Kan (1985). 

The upper bound of Lemma 13.2.2 has two cost components; the first component 
is proportional to the total “radial” cost between the depot and the customers. 
The second component is proportional to the “circular” cost: the cost of traveling 
between customers. This cost is related to the cost of the optimal traveling salesman 
tour. As discussed in Chapter 2, for large n, the cost of the optimal traveling 
salesman tour grows like $\sqrt{n}$, while the total radial cost between the depot and 
the customers grows like n. Therefore, it is intuitive that when the number of 
customers is large enough the first cost component will dominate the second. This 
observation is now formally proven. 

Theorem 13.3.1 Let $x_k$, $k = 1, 2, \ldots$, be a sequence of independent random 
variables having a distribution μ with compact support in $\mathbb{R}^2$. Let 

$$E(d) = \int_{\mathbb{R}^2} \|x\| \, d\mu(x).$$

Then, with probability one, 

$$\lim_{n \to \infty} \frac{Z^*}{n} = \frac{2}{Q} E(d).$$



Proof. Lemma 13.2.1 and the strong law of large numbers tell us that 

$$\lim_{n \to \infty} \frac{Z^*}{n} \ge \frac{2}{Q} E(d) \quad (a.s.).$$

On the other hand, from Lemma 13.2.2, 

$$\frac{Z^*}{n} \le \frac{Z^{ITP(1)}}{n} \le \frac{2}{nQ}\sum_{i \in N} d_i + \left(1 - \frac{1}{Q}\right)\frac{L^*(N_0)}{n}. \qquad (13.2)$$

From Chapter 4, we know that there exists a constant β > 0, independent of the 
distribution μ, such that with probability one, 

$$\lim_{n \to \infty} \frac{L^*(N_0)}{\sqrt{n}} = \beta \int_{\mathbb{R}^2} f^{1/2}(x)\,dx,$$

where f is the density of the absolutely continuous part of the distribution μ. 
Hence, 

$$\lim_{n \to \infty} \frac{Z^*}{n} \le \frac{2}{Q} E(d) \quad (a.s.).$$

This together with (13.2) proves the theorem. ∎
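
The radial term that drives the limit can be illustrated by simulation. The sketch below is an assumption-laden example, not part of the original analysis: it takes customers uniformly distributed in the unit disk with the depot at the center (so that E(d) = 2/3) and evaluates the per-customer radial quantity $\frac{2}{nQ}\sum_{i \in N} d_i$ for one sample.

```python
import math
import random


def radial_term_per_customer(n, Q, rng):
    """(2 / (n Q)) * sum of depot distances for n points uniform in the unit disk."""
    total = 0.0
    for _ in range(n):
        # the square root of a uniform variate is the radius of a point
        # drawn uniformly over the area of the unit disk
        total += math.sqrt(rng.random())
    return 2.0 * total / (n * Q)


rng = random.Random(1)
Q = 10
estimate = radial_term_per_customer(20000, Q, rng)
limit = (2.0 / Q) * (2.0 / 3.0)   # (2/Q) E(d) with E(d) = 2/3 for the unit disk
```

For a sample of this size the estimate should sit close to `limit`; the circular term, which grows like $\sqrt{n}$, contributes nothing per customer in the limit.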

The following observation is in order. Haimovich and Rinnooy Kan prove Theorem 
13.3.1 merely assuming E(d) is finite rather than the stronger assumption 
of a compact support. However, the restriction to a compact support seems to be 
satisfactory for all practical purposes. The following is another important generalization 
of Theorem 13.3.1. Assume that a cluster of customers (rather than a 
single customer) is located at point $x_k$, $k = 1, 2, \ldots, n$. The theorem then becomes 

$$\lim_{n \to \infty} \frac{Z^*}{n} = \frac{2}{Q} E(w)E(d), \qquad (13.3)$$

where E(w) is the expected cluster size, provided that the cluster size is independent 
of the location. This follows from a straightforward adaptation of Lemma 
13.2.1 and Lemma 13.2.2. 



13.4 Asymptotically Optimal Heuristics 

The proof of the previous theorem (Theorem 13.3.1) reveals that the ITP(α) 
heuristic provides a solution whose cost approaches the optimal cost when n tends 
to infinity. Indeed, replacing ITP(1) by ITP(α) in the previous proof gives the 
following theorem. 

Theorem 13.4.1 Under the conditions of Theorem 13.3.1 and for any fixed α ≥ 
1, the ITP(α) heuristic is asymptotically optimal. 

As is pointed out by Haimovich and Rinnooy Kan (1985), iterated tour par- 
titioning heuristics, although asymptotically optimal, hardly exploit the special 
topological structure of the Euclidean plane in which the points are located. It is 
therefore natural to consider Region Partitioning (RP) heuristics that are more 
geometric in nature. 

Haimovich and Rinnooy Kan consider three classes of regional partitioning 
schemes. In Rectangular Region Partitioning (RRP), one starts with a rectan- 
gle containing the set of customers N and cuts it into smaller rectangles. In Polar 
Region Partitioning (PRP) and Circular Region Partitioning (CRP), one starts 
with a circle centered at the depot and partitions it by means of circular arcs and 
radial lines. We shall shortly discuss each one of these in detail. 

In each case the RP heuristics construct subregions of the plane, where subregion 
j contains a set of customers N(j). These subregions are constructed so that each 
one of them has exactly Q customers except possibly one. 

Since every subset N(j) has no more than Q customers, each of these RP heuristics 
allocates one vehicle to each subregion. The vehicles then use the following 
routing strategy. The first customer visited is the one closest to the depot among 
all the customers in N(j). The rest are visited in the order of an α-optimal traveling 
salesman tour through N(j). After visiting all the customers in the subregion 
the vehicle returns to the depot through the first (closest) customer. It is therefore 
natural to call these heuristics RP(α) heuristics. In particular we have RRP(α), 
PRP(α) and CRP(α). 

Lemma 13.4.2 

$$Z^{RP(\alpha)} \le \frac{2}{Q}\sum_{i \in N} d_i + 2d_{\max} + \alpha \sum_{j} L^*(N(j)).$$

Proof. We number the subsets N(j) constructed by the RP(α) heuristic so that 
$|N(j)| = Q$ for every $j \ge 2$ and $|N(1)| \le Q$. It follows that the total distance 
traveled by the vehicle that visits subset N(j), for $j \ge 2$, is 

$$\le 2\min_{i \in N(j)} d_i + \alpha L^*(N(j)) \le \frac{2}{Q}\sum_{i \in N(j)} d_i + \alpha L^*(N(j)),$$

while the total distance traveled by the vehicle that visits N(1) is no more than 

$$2d_{\max} + \alpha L^*(N(1)).$$

Taking the sum over all subregions we obtain the desired result. ∎

The quality of the upper bound of Lemma 13.4.2 depends, of course, on the 
quantity $\sum_j L^*(N(j))$. This value was analyzed in Chapter 4 where it was shown 
that for any RP heuristic 

$$\sum_{j} L^*(N(j)) \le L^*(N) + \frac{3}{2}P^{RP}, \qquad (13.4)$$

where $P^{RP}$ is the sum of perimeters of the subregions generated by the RP heuristic. 
For this reason we analyze the quantity $P^{RP}$ in each of the three region 
partitioning heuristics. 



Rectangular Region Partitioning (RRP) 



This heuristic is identical to the one introduced for the Traveling Salesman 
Problem in Section 4.3. The smallest rectangle with sides a and b containing the 
set of customers N is partitioned by means of horizontal and vertical lines. First, 
the region is subdivided by t vertical lines such that each subregion contains exactly 
$(h+1)Q$ points except possibly the last one. Each of these $t + 1$ subregions is then 
partitioned by means of h horizontal lines into $h + 1$ smaller subregions such that 
each contains exactly Q points except possibly for the last one. 

As before, h and t should satisfy 

$$t = \left\lceil \frac{n}{(h+1)Q} \right\rceil - 1,$$

and 

$$t(h+1)Q < n \le (t+1)(h+1)Q.$$

The unique integer that satisfies these conditions is $h = \left\lceil \sqrt{n/Q} - 1 \right\rceil$. Note that the 
number of vertical lines added is $t \le \sqrt{n/Q}$ and each of these lines is counted twice 
in the quantity $P^{RRP}$. 

In the second step of the RRP we add h horizontal lines where $h \le \sqrt{n/Q}$. These 
horizontal lines are also counted twice in $P^{RRP}$. It follows that 

$$P^{RRP} \le 2\sqrt{\frac{n}{Q}}\,(a + b) + 2(a + b) \le 8d_{\max}\sqrt{\frac{n}{Q}} + 8d_{\max}.$$
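
The two cutting steps can be sketched in a few lines. This is a minimal illustration, assuming customers are given as (x, y) pairs and that ties in a coordinate are broken arbitrarily; `rrp_clusters` is a hypothetical name, not one used in the text.

```python
import math
import random


def rrp_clusters(points, Q):
    """Group points into rectangular cells of Q customers (vertical cuts, then horizontal)."""
    n = len(points)
    h = math.ceil(math.sqrt(n / Q) - 1)          # horizontal lines per vertical strip
    strip_size = (h + 1) * Q                     # customers per vertical strip
    by_x = sorted(points)                        # left-to-right order
    clusters = []
    for s in range(0, n, strip_size):
        # within a strip, cut bottom-to-top into groups of Q
        strip = sorted(by_x[s:s + strip_size], key=lambda p: p[1])
        for c in range(0, len(strip), Q):
            clusters.append(strip[c:c + Q])
    return clusters


random.seed(3)
points = [(random.random(), random.random()) for _ in range(103)]
clusters = rrp_clusters(points, Q=5)
sizes = [len(c) for c in clusters]
```

With 103 customers and Q = 5 this gives h = 4, so each full strip holds 25 customers and only the last cell of the last strip is left with fewer than Q.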



Polar Region Partitioning (PRP) 



The circle with radius $d_{\max}$ containing the set N and centered at the depot is 
partitioned in exactly the same way as in the previous partitioning scheme, with 
the exception that circular arcs and radial lines replace vertical and horizontal 
lines. Using the same analysis, one can show: 

$$P^{PRP} \le 6\pi d_{\max}\sqrt{\frac{n}{Q}} + 2d_{\max}. \qquad (13.5)$$



Circular Region Partitioning (CRP) 

This scheme partitions the circle centered at the depot with radius $d_{\max}$ into h 
equal sectors, where h is to be determined. Each sector is then partitioned into 
subregions by means of circular arcs, such that each subregion contains exactly Q 
customers except possibly the one closest to the depot. Thus, at most h subregions, 
each from one sector, have less than Q customers. These subregions (with the depot 
on their boundary) are then repartitioned by means of radial cuts such that at 
most $h - 1$ of them have exactly Q customers each except for possibly the last one. 

The total length of the initial radial lines is $h d_{\max}$. The length of an inner 
circular arc bounding a subregion containing a set N(j) is no more than 

$$\frac{2\pi}{h}\min_{i \in N(j)} d_i \le \frac{2\pi}{h}\cdot\frac{\sum_{i \in N(j)} d_i}{|N(j)|} = \frac{2\pi\sum_{i \in N(j)} d_i}{hQ},$$

while the length of the outer circle is $2\pi d_{\max}$. Finally, the repartitioning of the 
central subregions adds no more than $h d_{\max}/2$. Thus, 

$$P^{CRP} \le 2\left(h d_{\max} + \frac{2\pi\sum_{i \in N} d_i}{hQ} + \frac{h d_{\max}}{2}\right) + 2\pi d_{\max}.$$

Taking 

$$h = \left\lceil \sqrt{\frac{4\pi\sum_{i \in N} d_i}{3Q\,d_{\max}}} \right\rceil,$$

we obtain the following upper bound on $P^{CRP}$, 

$$P^{CRP} \le 4\sqrt{\frac{3\pi d_{\max}\sum_{i \in N} d_i}{Q}} + (3 + 2\pi)d_{\max}.$$
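
The choice of the number of sectors can be checked with a short calculation. The sketch below writes $S = \sum_{i \in N} d_i$ (a shorthand introduced here) and assumes the repartitioning term is $h d_{\max}/2$ and that h is rounded up to an integer; these are the values that make the intermediate bound agree with the final one.

```latex
\begin{align*}
P^{CRP} &\le 3h\,d_{\max} + \frac{4\pi S}{hQ} + 2\pi d_{\max},
  \qquad S = \sum_{i \in N} d_i. \\
\intertext{The first two terms balance at}
h_0 &= \sqrt{\frac{4\pi S}{3Q\,d_{\max}}},
  \qquad 3h_0 d_{\max} = \frac{4\pi S}{h_0 Q} = 2\sqrt{\frac{3\pi d_{\max} S}{Q}}. \\
\intertext{With $h = \lceil h_0 \rceil \le h_0 + 1$ and $1/h \le 1/h_0$,}
P^{CRP} &\le 3(h_0 + 1)d_{\max} + \frac{4\pi S}{h_0 Q} + 2\pi d_{\max}
  = 4\sqrt{\frac{3\pi d_{\max} S}{Q}} + (3 + 2\pi)\,d_{\max}.
\end{align*}
```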

The reader should be aware that all of these partitioning schemes can be implemented 
in O(n log n) time. We now have all the necessary ingredients for an 
asymptotic analysis of the performance of these partitioning heuristics. 

Theorem 13.4.3 Under the conditions of Theorem 13.3.1 and for any fixed α ≥ 
1, RRP(α), PRP(α) and CRP(α) are asymptotically optimal. 



Proof. Lemma 13.4.2 together with equation (13.4) provide the following upper 
bound on the total distance traveled by all vehicles in the solution produced by 
the above RP heuristics. 

$$Z^{RP(\alpha)} \le \frac{2}{Q}\sum_{i \in N} d_i + 2d_{\max} + \alpha L^*(N) + \frac{3}{2}\alpha P^{RP}.$$

By the strong law of large numbers and the fact that the distribution has compact 
support, $\frac{1}{n}\sum_{i \in N} d_i$ converges almost surely to E(d) while $d_{\max}/n$ converges almost 
surely to 0. Furthermore, $L^*(N)/n$ converges to 0 almost surely; see the proof of Theorem 
13.3.1. Finally, from the analysis of each of the region partitioning heuristics 
and the fact that the points are in a compact region, $P^{RP}/n$ converges almost surely 
to zero as well. ∎

In conclusion, we see that the CVRP with equal demands is asymptotically solvable 
via several different region partitioning schemes. In fact, since each customer 
has the same demand, the packing of the customers’ demands into the vehicles 
is a trivial problem. Any Q customers can fit. The problem in which 
demands are of different sizes presents complicating bin-packing features and 
will prove to be more difficult. 



13.5 Exercises 



Exercise 13.1. Consider the following version of the Capacitated Vehicle Routing 
Problem (CVRP). You are given a network G = (V, A) with positive arc lengths. 
Assume that $E \subseteq A$ is a given set of edges that have to be “covered” by vehicles. 
The vehicles are initially located at a depot $p \in V$. Each vehicle has a “capacity” 
q; that is, each vehicle can cover no more than q edges from E. Once a vehicle 
starts an edge in E it has to cover all of it. The objective is to design tours for 
vehicles so that all edges in E are covered, vehicles’ capacities are not violated and 
total distance traveled is as small as possible. 

(a) Suppose we want first to find a single tour that starts at the depot p, traverses 
all edges in E and ends at p whose total cost (length) is as small as possible. 
Generalize Christofides’ heuristic for this case. 

(b) Consider now the version of the CVRP described above and suggest two 
possible lower bounds on the optimal cost of the CVRP. 

(c) Describe a heuristic algorithm based on a tour partitioning approach using, 
as the initial tour, the tour you found in part (a). What is the worst-case bound 
of your algorithm? 

Exercise 13.2. Derive equation (13.3). 

Exercise 13.3. Consider an n customer instance of the CVRP with equal de- 
mands. Assume there are m depots and at each depot is an unlimited number of 
vehicles of limited capacity. Suggest an asymptotically optimal region partitioning 
scheme for this case. 

Exercise 13.4. Consider an n customer instance of the CVRP with equal demands. 
There are K customer types: a customer is of type k with independent 
probability $p_k > 0$. Customers of different types cannot be served together in the 
same vehicle. Devise an asymptotically optimal heuristic for this problem. If K is 
a function of n, what conditions on K(n) are necessary to ensure that this same 
heuristic is asymptotically optimal? 



Exercise 13.5. Derive equation (13.5). 



The Capacitated VRP with Unequal Demands 



14.1 Introduction 

In this chapter we consider the Capacitated Vehicle Routing Problem with unequal 
demands (UCVRP). In this version of the problem, each customer i has a demand 
$w_i$ and the capacity constraint stipulates that the total amount delivered by a 
single vehicle cannot exceed Q. We let Z* denote the optimal solution value of 
UCVRP, that is, the minimal total distance traveled by all vehicles. 

In this version of the problem, the demand of a customer cannot be split over several 
vehicles; that is, each customer must be served by a single vehicle. This more 
general version of the model is sometimes called the CVRP with unsplit demands. 
The version where demands may be split is dealt with in Chapter 13. Splitting a 
customer’s demand is often physically impossible or managerially undesirable due 
to customer service or accounting considerations. 



14.2 Heuristics for the CVRP 

A great deal of work has been devoted to the development of heuristics for the 
UCVRP; see, for example, Christofides (1985), Fisher (1995), Federgruen and 
Simchi-Levi (1995) or Bertsimas and Simchi-Levi (1996). Following Christofides, 
we classify these heuristics into the following four categories: 

• Constructive Methods 

• Route First-Cluster Second Methods 

• Cluster First-Route Second Methods 

• Incomplete Optimization Methods. 

We will describe the main characteristics of each of these classes and give ex- 
amples of heuristics that fall into each. 

Constructive Methods 

The Savings Algorithm suggested by Clarke and Wright (1964) is the most important 
member of this class. This heuristic, which is the basis for a number of 
commercial vehicle routing packages, is one of the earliest heuristics designed for 
this problem and, without a doubt, the most widely known. The idea of the savings 
algorithm is very simple: consider the depot and n demand points. Suppose that 
initially we assign a separate vehicle to each demand point. The total distance 
traveled by a vehicle that visits demand point i is $2d_i$, where $d_i$ is the distance 
from the depot to demand point i. Therefore, the total distance traveled in this 
solution is $2\sum_{i=1}^{n} d_i$. 

If we now combine two routes, say we serve i and j on a single trip (with the 
same vehicle), the total distance traveled by this vehicle is $d_i + d_{ij} + d_j$, where $d_{ij}$ 
is the distance between demand points i and j. Thus, the savings obtained from 
combining demand points i and j, denoted $S_{ij}$, is: 

$$S_{ij} = 2d_i + 2d_j - (d_i + d_j + d_{ij}) = d_i + d_j - d_{ij}.$$

The larger the savings $S_{ij}$, the more desirable it is to combine demand points i 
and j. Based on this idea, Clarke and Wright suggest the following algorithm. 

The Savings Algorithm 

Step 1: Start with the solution that has each customer visited by a separate vehicle. 

Step 2: Calculate the savings $S_{ij} = d_{0i} + d_{j0} - d_{ij} \ge 0$ for all pairs of customers i 
and j. 

Step 3: Sort the savings in nonincreasing order. 

Step 4: Find the first feasible arc (i, j) in the savings list where 

1) i and j are on different routes, 

2) both i and j are either the first or last visited on their 
respective routes, and 

3) the sum of demands of the routes containing i and j is no more than Q. 

Add arc (i, j) to the current solution and delete arcs (0, i) and (j, 0). Delete 
arc (i, j) from the savings list. 

Step 5: Repeat Step 4 until no more arcs satisfy the conditions. 

Additional constraints, which might be present, can easily be incorporated into 
Step 4. Usually a simple check can be performed to see whether combining the 
tours containing i and j violates any of these constraints. 

Other examples of heuristics that fall into this class are the heuristics of Gaskell 
(1967), Yellow (1970) and Russell (1977). In particular, the first two are modifications 
of the Savings algorithm. 
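
The steps of the Savings Algorithm can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: distances are Euclidean, every single demand fits in a vehicle, and the function and variable names are hypothetical. A single pass over the sorted list is used in place of the repeated search of Step 4, which gives the same result because a pair that fails one of the three conditions cannot satisfy it later.

```python
import math


def savings_routes(depot, customers, demand, Q):
    """Clarke-Wright style merging; returns routes as lists of customer indices."""
    n = len(customers)
    d0 = [math.dist(depot, c) for c in customers]
    routes = {i: [i] for i in range(n)}      # Step 1: one vehicle per customer
    load = {i: demand[i] for i in range(n)}
    route_of = list(range(n))                # route id of each customer
    # Steps 2-3: savings for all pairs, in nonincreasing order
    savings = sorted(
        ((d0[i] + d0[j] - math.dist(customers[i], customers[j]), i, j)
         for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    # Steps 4-5: scan the list once, merging whenever the conditions hold
    for s, i, j in savings:
        ri, rj = route_of[i], route_of[j]
        if s <= 0 or ri == rj or load[ri] + load[rj] > Q:
            continue
        a, b = routes[ri], routes[rj]
        if i not in (a[0], a[-1]) or j not in (b[0], b[-1]):
            continue                         # i or j is interior to its route
        if a[-1] != i:
            a.reverse()                      # make i the last stop of a
        if b[0] != j:
            b.reverse()                      # make j the first stop of b
        a.extend(b)
        load[ri] += load[rj]
        for c in b:
            route_of[c] = ri
        del routes[rj], load[rj]
    return list(routes.values())


def total_length(depot, customers, routes):
    """Sum of depot-to-depot route lengths."""
    total = 0.0
    for r in routes:
        stops = [depot] + [customers[i] for i in r] + [depot]
        total += sum(math.dist(p, q) for p, q in zip(stops, stops[1:]))
    return total


depot = (0.0, 0.0)
customers = [(1.0, 0.0), (2.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
routes = savings_routes(depot, customers, demand=[1, 1, 1, 1], Q=2)
length = total_length(depot, customers, routes)
```

In this four-customer example the initial solution has length 12; merging the two customers on each axis brings it down to 8.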

Route First-Cluster Second Methods 

Traditionally, this class has been defined as follows. The class consists of those 
heuristics that first construct a traveling salesman tour through all the customers 
(route first) and then partition the tour into segments (cluster second). One vehicle 
is assigned to each segment and visits the customers according to their appearance 
on the traveling salesman tour. 

As we shall see in the next section some strong statements can be made about 
the performance of heuristics of this class. For this purpose, we give a more precise 
definition of the class here. 

Definition 14.2.1 A heuristic is a route first-cluster second heuristic if it first 
orders the customers according to their locations, disregarding demand sizes, and 
then partitions this ordering to produce feasible clusters. These clusters consist of 
sets of customers that are consecutive in the initial order. Customers are then 
routed within their cluster depending on the specific heuristic. 

This definition of the class is more general than the traditional definition given 
above. The disadvantage of this class, of which we will give a rigorous analysis, 
can be highlighted by the following simple example. Consider a routing strategy 
that orders the demands in such a way that the sequence of demand sizes in the 
order is (9, 2, 9, 2, 9, 2, 9, 2, . . .). If the vehicle capacity is 10, then any partition of 
this tour must assign one vehicle to each customer. This solution would consist of 
half of the vehicles going to pick up two units (using 20% of the vehicle capacity) 
and returning to the depot; not a very efficient strategy. By contrast, a routing 
strategy that looks at the demands at the same time as it looks at customer 
locations would clearly find a more intelligent ordering of the customers: one that 
sequences demands efficiently to decrease total distance traveled. 
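
The example can be made concrete by counting vehicles. The sketch below is illustrative only: it ignores distances, compares the fewest vehicles needed when clusters must be consecutive in the given order against a simple first-fit decreasing packing that is free to ignore the order, and uses hypothetical function names.

```python
def consecutive_partition(demands, capacity):
    """Fewest vehicles when each cluster is a consecutive run of the given order."""
    vehicles, load = 0, None
    for w in demands:
        if load is None or load + w > capacity:
            vehicles, load = vehicles + 1, w   # open a new vehicle
        else:
            load += w
    return vehicles


def first_fit_decreasing(demands, capacity):
    """Number of vehicles used by first-fit decreasing, ignoring the ordering."""
    loads = []
    for w in sorted(demands, reverse=True):
        for b in range(len(loads)):
            if loads[b] + w <= capacity:
                loads[b] += w
                break
        else:
            loads.append(w)                    # no open vehicle has room
    return len(loads)


demands = [9, 2] * 10                          # the sequence 9, 2, 9, 2, ...
ordered = consecutive_partition(demands, 10)
packed = first_fit_decreasing(demands, 10)
```

With twenty customers the fixed ordering needs twenty vehicles, while grouping the small demands five to a vehicle needs only twelve.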

The route first-cluster second class includes classical heuristics such as the Opti- 
mal Partitioning heuristic introduced by Beasley (1983), and the Sweep algorithm 
suggested by Gillett and Miller (1974). 

In the Optimal Partitioning heuristic, one tries to find an optimal traveling 
salesman tour, or, if this is not possible, a tour that is close to optimal. This 
provides the initial ordering of the demand points. The ordering is then partitioned 
in an efficient way into segments. This step can be done by formulating a shortest 
path problem. See Section 13.2 for details. 

In the Sweep algorithm, an arbitrary demand point is selected as the starting 
point. The other customers are ordered according to the angle made between them, 
the depot and the starting point. Demands are then assigned to vehicles following 
this initial order. In effect, the points are “swept” in a clockwise direction around 
the depot and assigned to vehicles. Then efficient routes are designed for each 
vehicle. Specifically, the Sweep algorithm is the following. 

The Sweep Algorithm 

Step 1: Calculate the polar coordinates of all customers where the center is the depot 
and an arbitrary customer is chosen to be at angle 0. Reorder the customers 
so that 

$$0 = \theta_1 \le \theta_2 \le \cdots \le \theta_n.$$

Step 2: Starting from the unrouted customer i with smallest angle $\theta_i$ construct a new 
cluster by sweeping consecutive customers i + 1, i + 2, ... until the capacity 
constraint will not allow the next customer to be added. 

Step 3: Continue Step 2 until all customers are included in a cluster. 

Step 4: For each cluster constructed, solve the TSP on the subset of customers and 
the depot. 

In both of these methods additional constraints can easily be incorporated into 
the algorithm. 
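
Steps 1 to 3 of the Sweep algorithm can be sketched as follows. This is a minimal illustration, assuming planar coordinates, a clockwise sweep, and that every single demand fits in a vehicle; the routing of Step 4 is left out, and `sweep_clusters` is a hypothetical name.

```python
import math


def sweep_clusters(depot, customers, demand, Q, start=0):
    """Cluster customers in clockwise angular order, starting from customers[start]."""
    def angle(i):
        x, y = customers[i]
        return math.atan2(y - depot[1], x - depot[0])

    ref = angle(start)
    two_pi = 2.0 * math.pi
    # Step 1: clockwise offset from the starting customer, in [0, 2*pi)
    order = sorted(range(len(customers)), key=lambda i: (ref - angle(i)) % two_pi)
    clusters, current, load = [], [], 0
    for i in order:
        # Steps 2-3: close the cluster when the next customer does not fit
        if current and load + demand[i] > Q:
            clusters.append(current)
            current, load = [], 0
        current.append(i)
        load += demand[i]
    if current:
        clusters.append(current)
    return clusters


depot = (0.0, 0.0)
customers = [(1.0, 0.0), (0.0, -1.0), (-1.0, 0.0), (0.0, 1.0)]
clusters = sweep_clusters(depot, customers, demand=[4, 4, 4, 4], Q=10)
```

Each resulting cluster would then be routed by any TSP method, exact or heuristic, on the cluster's customers together with the depot.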

We note that, traditionally, researchers have classified the Sweep algorithm as a 
cluster first-route second method and not as a route first-cluster second method. 
Our opinion is that the essential part of any vehicle routing algorithm is the 
clustering phase of the algorithm, that is, how the customers are clustered into 
groups that can be served by individual vehicles. The specific sequencing within 
a cluster can and, for most problems, should be done once these clusters are 
determined. Therefore, a classification of algorithms for the CVRP should be solely 
based on how the clustering is performed. Thus, the Sweep algorithm can be 
viewed as an algorithm of the route first-cluster second class since the clustering 
is performed on a fixed ordering of the nodes. 

Cluster First-Route Second Methods 

In this class of heuristics, the clustering is the most important phase. Customers 
are first clustered into feasible groups to be served by the same vehicle (cluster 
first) without regard to any preset ordering and then efficient routes are designed 
for each cluster (route second). 

Heuristics of this class are usually more technically sophisticated than the pre- 
vious class, since determining the clusters is often based on a mathematical pro- 
gramming approach. This class includes the following three heuristics: 

• The Two-Phase Method (Christofides et al., 1978) 

• The Generalized Assignment Heuristic (Fisher and Jaikumar, 1981) 

• The Location-Based Heuristic (Bramel and Simchi-Levi, 1995) 

The first two heuristics use, in a first step, the concept of seed customers. The 
seed customers are customers that will be in separate vehicles in the solution, 
and around which tours are constructed. In both cases, the performance of the 
algorithm depends highly on the choice of these seeds. Placing the CVRP in the 
framework of a different combinatorial problem, the Location-Based Heuristic selects 
the seeds in an optimal way and creates, at the same time, tours around 
these seeds. Thus, instead of decomposing the process into two steps, as done in 
the Two-Phase Method and the Generalized Assignment Heuristic, the Location-Based 
Heuristic simultaneously picks the seeds and designs tours around them. 
We will discuss this heuristic in detail in Section 14.7. 

Incomplete Optimization Methods 

These methods are optimization algorithms that, due to the prohibitive com- 
puting time involved in reaching an optimal solution, are terminated prematurely. 
Examples of these include: 

• Cutting Plane Methods (Cornuejols and Harche, 1993) 

• Minimum K-Tree Methods (Fisher, 1994). 

The disadvantage of incomplete optimization methods is that they still require 
large amounts of processing time; they can usually handle problems with no more 
than 100 customers. 



14.3 Worst-Case Analysis of Heuristics 

In the worst-case analysis presented here, we assume that the customer demands 
$w_1, w_2, \ldots, w_n$ and the vehicle capacity Q are rationals. Hence, without loss of 
generality, Q and $w_i$ are assumed to be integers. Furthermore, we may assume that 
Q is even; otherwise one can double Q as well as each $w_i$, $i = 1, 2, \ldots, n$, without 
affecting the problem. The following two-phase route first-cluster second heuristic 
was suggested by Altinkemer and Gavish (1987). In the first phase, we relax the 
requirement that the demand of a customer cannot be split. Each customer i is 
replaced by $w_i$ unit demand points that are zero distance apart. We then apply 
the ITP(α) heuristic (see Section 13.2) using a vehicle capacity of $Q/2$. In the second 
phase, we convert the solution obtained in Phase I to a feasible solution to the 
original problem without increasing the total cost. This heuristic is called the 
Unequal-Weight Iterated Tour Partitioning (UITP(α)) heuristic. 

We now describe the second phase procedure. Our notation follows the one suggested 
by Haimovich et al. (1988). Let $m = \sum_{i \in N} w_i$ be the number of demand 
points in the expanded problem. Recall that in the first phase an arbitrary orientation 
of the tour is chosen. The customers are then numbered $x^{(0)}, x^{(1)}, x^{(2)}, \ldots, x^{(n)}$ 
in order of their appearance on the tour, where $x^{(0)}$ is the depot. The ITP(α) 
heuristic partitions the path from $x^{(1)}$ to $x^{(n)}$ into $\lceil 2m/Q \rceil$ (or $\lceil 2m/Q \rceil + 1$) disjoint 
segments such that each one contains no more than $Q/2$ demand points and connects 
the end-points of each segment to the depot. The segments are indexed by 
$j = 1, 2, \ldots, \lceil 2m/Q \rceil$, such that the first customer of the $j$th segment is $x^{(b_j)}$ and the 
last customer is $x^{(e_j)}$. Hence, the $j$th segment, denoted by $S_j$, includes customers 
$\{x^{(b_j)}, \ldots, x^{(e_j)}\}$. Obviously, if $x^{(e_j)} = x^{(b_{j+1})}$ for some j, then the demand of customer 
$x^{(e_j)}$ is split between the $j$th and $(j+1)$th segments; therefore, these are 
not feasible routes. On the other hand, if $x^{(e_j)} \ne x^{(b_{j+1})}$ for all j, then the set of 
routes is feasible. 

We now transform the solution obtained in the first phase into a feasible solution 
without increasing the total distance traveled. We use the following procedure. 

The Phase Two Procedure 

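Step 1: Set $\bar{S}_j = \emptyset$, for $j = 1, 2, \ldots, \lceil 2m/Q \rceil$. 

Step 2: For $j = 1$ to $\lceil 2m/Q \rceil - 1$ do 
If $x^{(e_j)} = x^{(b_{j+1})}$ then 

If $\sum_{i=b_j}^{e_j} w_{x^{(i)}} \le Q$ then let $\bar{S}_j = \{x^{(b_j)}, \ldots, x^{(e_j)}\}$ and 
let $x^{(b_{j+1})} = x^{(b_{j+1}+1)}$ 

else let $\bar{S}_j = \{x^{(b_j)}, \ldots, x^{(e_j - 1)}\}$ and $x^{(b_{j+1})} = x^{(e_j)}$ 

else, let $\bar{S}_j = \{x^{(b_j)}, \ldots, x^{(e_j)}\}$. 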

We argue that the procedure generates feasible sets $\bar{S}_j$ for $j = 1, 2, \ldots, \lceil 2m/Q \rceil$. 
Note that the $j$th set can be enlarged only in the $(j-1)$st and $j$th iterations (if at 
all). Moreover if it is enlarged in the $j$th iteration, it is clearly done feasibly in view 
of the test $\sum_{i=b_j}^{e_j} w_{x^{(i)}} \le Q$. On the other hand, if $S_j$ is enlarged in the $(j-1)$st 
iteration, at most $Q/2$ demand points are added thus ensuring feasibility. This can be 
verified as follows. Assume to the contrary that in the $(j-1)$st iteration more than 
$Q/2$ demand points are transferred from $S_{j-1}$ to $S_j$ so that in the $(j-1)$st iteration 
$x^{(e_{j-1})} = x^{(b_j)}$. Since the original set $S_{j-1}$ contains at most $Q/2$ demand points we 
must have shifted demand points in the $(j-2)$nd iteration from $S_{j-2}$ to $S_{j-1}$ (and 
in particular $x^{(b_{j-1})} = x^{(e_{j-2})}$), part of which are now being transferred to $S_j$. 
This implies that $x^{(k)} = x^{(e_{j-2})} = x^{(b_{j-1})} = x^{(e_{j-1})} = x^{(b_j)}$, where $e_{j-2}$, $b_{j-1}$, $e_{j-1}$ 
and $b_j$ refer to the original sets $S_{j-2}$, $S_{j-1}$ and $S_j$. In other words at the beginning 
of the $(j-1)$st iteration the set $S_{j-1}$ contains a single customer $x^{(k)}$. But then, 
shifting $x^{(k)} = x^{(b_j)}$ backwards to $S_{j-1}$ is feasible, contradicting the fact that more 
than $Q/2$ demand points need to be shifted forward from $S_{j-1}$ to $S_j$. Therefore, the 
procedure generates feasible sets and we have the following worst case bound. 

Theorem 14.3.1 

$$\frac{Z^{UITP(\alpha)}}{Z^*} \le 2 + \left(1 - \frac{2}{Q}\right)\alpha.$$

Proof. Recall that in the first phase the vehicle capacity is set to $Q/2$. Hence, using 
the bound of Lemma 13.2.2 we obtain the following upper bound on the length of 
the tours generated in Phase I of the UITP(α) heuristic, 

$$\frac{4}{Q}\sum_{i \in N} d_i w_i + \left(1 - \frac{2}{Q}\right)\alpha L^*(N_0). \qquad (14.1)$$

In the second phase of the algorithm, the tour obtained in the first phase is converted 
into a feasible solution with total length no more than (14.1). To verify 
this, we need only to analyze those segments whose end-points are modified by 
the procedure. 

Suppose that $S_j$ and $\bar{S}_j$ differ in their starting point; then $\bar{S}_j$ must start with 
$x^{(b_j+1)}$. This implies that arc $(x^{(b_j)}, x^{(b_j+1)})$, which is part of the Phase I solution, 
does not appear in the $j$th route. The triangle inequality ensures that the sum of 
the length of arcs $(x^{(0)}, x^{(b_j)})$ and $(x^{(b_j)}, x^{(b_j+1)})$ is no smaller than the length of 
arc $(x^{(0)}, x^{(b_j+1)})$. A similar argument can be applied if $S_j$ and $\bar{S}_j$ differ in their 
terminating point. Consequently, for every segment j, for $j = 1, 2, \ldots, \lceil 2m/Q \rceil$, the 
length of the $j$th route according to the new partition is no longer than the length 
of the $j$th route according to the old partition. Hence, 

$$Z^{UITP(\alpha)} \le \frac{4}{Q}\sum_{i \in N} d_i w_i + \left(1 - \frac{2}{Q}\right)\alpha L^*(N_0).$$

Clearly, Z* is at least the optimal value of the expanded problem in which demands 
may be split, and therefore using the lower bound developed in Lemma 
13.2.1 completes the proof. ∎

The UITP heuristic was divided into two phases to prove the above worst-case 
result. However, if the Optimal Partitioning heuristic is used in the unequal weight 
model, the actual implementation is a one-step process. This is done as follows. 
Given a traveling salesman tour through the set of customers and the depot, we 
number the nodes $x^{(0)}, x^{(1)}, \ldots, x^{(n)}$ in order of their appearance on the tour where 
$x^{(0)}$ is the depot. We then define a distance matrix with cost $C_{jk}$, where 

$$C_{jk} = \begin{cases} \text{the distance traveled by a vehicle that starts at } x^{(0)}, \text{ visits customers } x^{(j+1)}, x^{(j+2)}, \ldots, x^{(k)} \text{ and returns to } x^{(0)}, & \text{if } \sum_{i=j+1}^{k} w_{x^{(i)}} \le Q; \\ \infty, & \text{otherwise.} \end{cases}$$

As in the equal demand case (see Section 13.2), it follows that a shortest path from 
$x^{(0)}$ to $x^{(n)}$ in the directed graph with distance cost $C_{jk}$ corresponds to an optimal 
partition of the traveling salesman tour. This version of the heuristic, developed 
by Beasley and called the Unequal-Weight Optimal Partitioning (UOP) heuristic, 
also has $Z^{UOP(\alpha)}/Z^* \le 2 + \left(1 - \frac{2}{Q}\right)\alpha$. The following theorem, proved by Li and 
Simchi-Levi (1990), implies that when α = 1, this bound is asymptotically tight 
as Q approaches infinity. 

Theorem 14.3.2 For any integer Q > 1, there exists a problem instance with 
^uop(i )/z* (and therefore Z V1TP ^ /Z*) arbitrarily close to 3 — 



Proof. We modify the graph $G(2, KQ+1)$, where $K$ is a positive integer, as follows.
Every group now, instead of containing $Q$ customers, contains only one customer
with demand $Q$. The other $KQ$ customers have unit demand. The optimal traveling
salesman tour is again as shown in Figure 13.2, and the solution obtained by the
UOP(1) heuristic is to have $2KQ + 1$ vehicles, each one of them serving only one
customer. Thus

$$Z^{UOP(1)} = 2(KQ + 1) + 4KQ.$$

The optimal solution to this problem has $KQ + 1$ vehicles serve those customers
with demand $Q$, and $K$ other vehicles serve the unit demand customers. Hence,

$$Z^* = 2(KQ + 1) + 4K.$$

Therefore,

$$\lim_{K \to \infty} \frac{Z^{UOP(1)}}{Z^*} = \lim_{K \to \infty} \frac{2(KQ + 1) + 4KQ}{2(KQ + 1) + 4K} = 3 - \frac{6}{Q + 2}.$$

∎
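The limit in the proof can be checked numerically. The short sketch below, with hypothetical helper names chosen for this illustration, evaluates the two cost expressions from the proof with exact rational arithmetic and compares their ratio for a very large $K$ with $3 - 6/(Q+2)$.

```python
from fractions import Fraction

def uop_cost(K, Q):
    # Cost of the UOP(1) solution in the proof: 2(KQ + 1) + 4KQ
    return 2 * (K * Q + 1) + 4 * K * Q

def opt_cost(K, Q):
    # Cost of the optimal solution in the proof: 2(KQ + 1) + 4K
    return 2 * (K * Q + 1) + 4 * K

def ratio(K, Q):
    # Exact ratio of the two costs for given K and Q
    return Fraction(uop_cost(K, Q), opt_cost(K, Q))

def limit_ratio(Q):
    # Claimed limit as K grows: 3 - 6/(Q + 2)
    return 3 - Fraction(6, Q + 2)
```

As $Q$ grows, `limit_ratio(Q)` approaches 3, which is the sense in which the worst-case bound is asymptotically tight.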



14.4 The Asymptotic Optimal Solution Value 

In the probabilistic analysis of the UCVRP we assume, without loss of generality,
that the vehicles' capacity $Q$ equals 1, and the demand of each customer is no
more than 1. Thus, vehicles and demands in a capacitated vehicle routing problem
correspond to bins and item sizes (respectively) in a Bin-Packing Problem. Hence,
for every routing instance there is a unique corresponding bin-packing instance.

Assume the demands $w_1, w_2, \ldots, w_n$ are drawn independently from a distribution
$\Phi$ defined on $[0, 1]$. Assume customer locations are drawn independently from
a probability measure $\mu$ with compact support in $\mathbb{R}^2$. We assume that $d_i > 0$ for
each $i \in N$ since customers at the depot can be served at no cost. In this section
we find the asymptotic optimal solution value for any $\Phi$ and any $\mu$. This is done by
showing that an asymptotically optimal algorithm for the Bin-Packing Problem,
with item sizes distributed like $\Phi$, can be used to solve, in an asymptotic sense,
the UCVRP.

Given the demands $w_1, w_2, \ldots, w_n$, let $b_n^*$ be the number of bins used in the
optimal solution to the corresponding Bin-Packing Problem. As demonstrated in
Theorem 4.2.4 there exists a constant $\gamma > 0$ (depending only on $\Phi$) such that

$$\lim_{n \to \infty} \frac{b_n^*}{n} = \gamma \quad (a.s.). \tag{14.2}$$

We shall refer to the constant $\gamma$ as the bin-packing constant and omit the dependence
of $\gamma$ on $\Phi$ in the notation.

The following theorem was proved by Simchi-Levi and Bramel (1990). Recall that,
without loss of generality, the depot is positioned at $(0,0)$ and $\|x\|$ represents the
distance from the point $x \in \mathbb{R}^2$ to the depot.
Theorem 14.4.1 Let $x_k$, $k = 1, 2, \ldots, n$, be a sequence of independent random
variables having a distribution $\mu$ with compact support in $\mathbb{R}^2$. Let

$$E(d) = \int_{\mathbb{R}^2} \|x\| \, d\mu(x).$$

Let the demands $w_k$, $k = 1, 2, \ldots, n$, be a sequence of independent random variables
having a distribution $\Phi$ with support on $[0, 1]$ and assume that the demands and the
locations of the customers are independent of each other. Let $\gamma$ be the bin-packing
constant associated with the distribution $\Phi$; then, almost surely,

$$\lim_{n \to \infty} \frac{1}{n} Z_u^* = 2\gamma E(d).$$

Thus, the theorem fully characterizes the asymptotic optimal solution value of
the UCVRP, for any reasonable distributions $\Phi$ and $\mu$. An interesting observation
concerns the case where the distribution of the demands allows perfect packing,
that is, when the wasted space in the bins tends to become a small fraction of the
number of bins used. Formally, $\Phi$ is said to allow perfect packing if almost surely
$\lim_{n \to \infty} b_n^*/n = E(w)$. Karmarkar (1982) proved that a nonincreasing probability
density function (with some mild regularity conditions) allows perfect packing.
Rhee (1988) completely characterizes the class of distribution functions $\Phi$ which
allow perfect packing. Clearly, in this case $\gamma = E(w)$. Thus, Theorem 14.4.1 indicates
that allowing the demands to be split or not does not change the asymptotic
objective function value. That is, the UCVRP and the ECVRP can be said to be
asymptotically equivalent when $\Phi$ allows perfect packing.

To prove Theorem 14.4.1, we start by presenting in Section 14.4.1 a lower bound
on the optimal objective function value. In Section 14.4.2, we present a heuristic
for the UCVRP based on a simple region partitioning scheme. We show that the
cost of the solution produced by the heuristic converges to our lower bound for
any $\Phi$ and $\mu$, thus proving the main theorem of the section.



14.4.1 A Lower Bound 



We introduce a lower bound on the optimal objective function value $Z_u^*$. Let $A \subset \mathbb{R}^2$
be the compact support of $\mu$ and define $d_{\max} = \sup_{x \in A}\{\|x\|\}$. For a given
fixed positive integer $r \ge 1$, partition the circle with radius $d_{\max}$ centered at the
depot into $r$ rings of equal width. Let $d_j = (j - 1)\frac{d_{\max}}{r}$ for $j = 1, 2, \ldots, r, r + 1$,
and construct the following $2r$ sets of customers:

$$S_j = \{x_k \in N \mid d_j < d_k \le d_{j+1}\} \quad \text{for } j = 1, \ldots, r,$$

and

$$F_j = \bigcup_{i=j}^{r} S_i \quad \text{for } j = 1, 2, \ldots, r.$$

Note that $F_r \subseteq F_{r-1} \subseteq \cdots \subseteq F_1 = N$ since $d_k > 0$ for all $x_k \in N$.
In the lemma below, we show that $|F_r|$ grows to infinity almost surely as $n$
grows to infinity. This implies that $|F_j|$ also grows to infinity almost surely for
$j = 1, 2, \ldots, r$, since $|F_{j+1}| \le |F_j|$ for $j = 1, 2, \ldots, r - 1$. The proof follows from
the definitions of compact support and $d_{\max}$.

Lemma 14.4.2

$$\lim_{n \to \infty} \frac{|F_r|}{n} = p \quad (a.s.) \text{ for some constant } p > 0.$$

For any set of customers $T \subseteq N$, let $b^*(T)$ be the minimum number of vehicles
needed to serve the customers in $T$; that is, $b^*(T)$ is the optimal solution to the
Bin-Packing Problem defined by item sizes equal to the demands of the customers
in $T$. We can now present a family of lower bounds on $Z_u^*$ that hold for different
values of $r \ge 1$.

Lemma 14.4.3

$$Z_u^* \ge 2\frac{d_{\max}}{r}\sum_{j=2}^{r} b^*(F_j) \quad \text{for any } r \ge 1.$$

Proof. Given an optimal solution to the UCVRP, let $K_r^*$ be the number of vehicles
in the optimal solution that serve at least one customer from $S_r$, and for $j = 1, 2, \ldots, r - 1$,
let $K_j^*$ be the number of vehicles in the optimal solution that serve
at least one customer in the set $S_j$, but do not serve any customers in $F_{j+1}$. Also,
let $V_j^*$ be the number of vehicles in the optimal solution that serve at least one
customer in $F_j$. By these definitions, $V_j^* = \sum_{i=j}^{r} K_i^*$ for $j = 1, 2, \ldots, r$; hence,

$$K_j^* = V_j^* - V_{j+1}^* \ \text{ for } j = 1, 2, \ldots, r - 1 \quad \text{and} \quad K_r^* = V_r^*.$$

Note that $V_j^* \ge b^*(F_j)$ for $j = 1, 2, \ldots, r$, since $V_j^*$ represents the number of
vehicles used in a feasible packing of the demands of customers in $F_j$, while $b^*(F_j)$
represents the number of bins used in an optimal packing.

By the definition of $K_j^*$ and $d_j$, $Z_u^* \ge 2\sum_{j=1}^{r} d_j K_j^*$ and therefore,

$$\begin{aligned}
Z_u^* &\ge 2 d_r V_r^* + 2\sum_{j=1}^{r-1} d_j (V_j^* - V_{j+1}^*) \\
&= 2 d_1 V_1^* + 2\sum_{j=2}^{r} (d_j - d_{j-1}) V_j^* \\
&= 2\sum_{j=2}^{r} (d_j - d_{j-1}) V_j^* && (\text{since } d_1 = 0) \\
&\ge 2\sum_{j=2}^{r} (d_j - d_{j-1}) b^*(F_j) && (\text{since } V_j^* \ge b^*(F_j)) \\
&= 2\sum_{j=2}^{r} \frac{d_{\max}}{r} b^*(F_j).
\end{aligned}$$

∎
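The ring-based bound of Lemma 14.4.3 is easy to evaluate on data. The sketch below is an illustration under a stated assumption: instead of the optimal bin count $b^*(F_j)$, which is hard to compute, it uses the weaker quantity $\lceil \sum_{x_k \in F_j} w_k \rceil$, which never exceeds $b^*(F_j)$ when the capacity is 1, so the result is still a valid lower bound. The function name `ring_lower_bound` is hypothetical.

```python
import math

def ring_lower_bound(dists, demands, r):
    """Evaluate 2 * (d_max / r) * sum_{j=2}^{r} b(F_j) for unit-capacity vehicles.

    dists[k] is the distance d_k of customer k from the depot and
    demands[k] its demand w_k in [0, 1]. b(F_j) is replaced here by
    ceil(total demand in F_j), an assumption that keeps the bound valid
    because any packing needs at least that many unit bins.
    """
    d_max = max(dists)
    width = d_max / r
    total_bins = 0
    for j in range(2, r + 1):
        d_j = (j - 1) * width
        # F_j contains every customer farther than d_j from the depot
        load = sum(w for d, w in zip(dists, demands) if d > d_j)
        # small tolerance so floating-point noise does not round up
        total_bins += math.ceil(load - 1e-12)
    return 2 * width * total_bins
```

Increasing `r` refines the rings and usually tightens the bound, mirroring the role of large $r$ in the asymptotic argument.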
Note that Lemma 14.4.3 provides a deterministic lower bound; that is, no probabilistic
assumptions are involved. Lemma 14.4.2 and Lemma 14.4.3 are both required
to provide a lower bound on $\frac{1}{n} Z_u^*$ that holds almost surely.

Lemma 14.4.4 Under the conditions of Theorem 14.4.1, we have

$$\lim_{n \to \infty} \frac{1}{n} Z_u^* \ge 2\gamma E(d) \quad (a.s.).$$



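Proof. Lemma 14.4.3 implies that

$$\begin{aligned}
\lim_{n \to \infty} \frac{1}{n} Z_u^* &\ge 2\frac{d_{\max}}{r}\sum_{j=2}^{r} \lim_{n \to \infty} \frac{b^*(F_j)}{n} \\
&= 2\frac{d_{\max}}{r}\sum_{j=2}^{r} \lim_{n \to \infty} \frac{b^*(F_j)}{|F_j|} \lim_{n \to \infty} \frac{|F_j|}{n}.
\end{aligned}$$

From Lemma 14.4.2, $|F_j|$ grows to infinity almost surely as $n$ grows to infinity,
for $j = 1, 2, \ldots, r$. Moreover, since demands and locations are independent of each
other, the demands in $F_j$, $j = 1, 2, \ldots, r$, are distributed like $\Phi$. Therefore,

$$\lim_{n \to \infty} \frac{b^*(F_j)}{|F_j|} = \lim_{|F_j| \to \infty} \frac{b^*(F_j)}{|F_j|} = \gamma \quad (a.s.).$$

Hence, almost surely

$$\begin{aligned}
\lim_{n \to \infty} \frac{1}{n} Z_u^* &\ge 2\frac{d_{\max}}{r}\gamma \sum_{j=2}^{r} \lim_{n \to \infty} \frac{|F_j|}{n} \\
&= 2\frac{d_{\max}}{r}\gamma \lim_{n \to \infty} \frac{1}{n}\sum_{j=2}^{r} |F_j|.
\end{aligned}$$

Since

$$F_j = \bigcup_{i=j}^{r} S_i \quad \text{for } j = 1, 2, \ldots, r,$$

we have $|F_j| = \sum_{i=j}^{r} |S_i|$; hence, almost surely

$$\begin{aligned}
\lim_{n \to \infty} \frac{1}{n} Z_u^* &\ge 2\frac{d_{\max}}{r}\gamma \lim_{n \to \infty} \frac{1}{n}\sum_{j=2}^{r}\sum_{i=j}^{r} |S_i| \\
&= 2\frac{d_{\max}}{r}\gamma \lim_{n \to \infty} \frac{1}{n}\sum_{j=2}^{r} (j - 1)|S_j|.
\end{aligned}$$

By the definition of $d_j$,

$$\lim_{n \to \infty} \frac{1}{n} Z_u^* \ge 2\gamma \lim_{n \to \infty} \frac{1}{n}\sum_{j=2}^{r} d_j |S_j| = 2\gamma \lim_{n \to \infty} \frac{1}{n}\sum_{j=1}^{r} d_j |S_j|,$$

since $d_1 = 0$ and $|S_1| \le n$. By the definition of $d_j$ and $S_j$, $d_j \ge d_k - \frac{d_{\max}}{r}$ for all
$x_k \in S_j$. Then almost surely

$$\begin{aligned}
\lim_{n \to \infty} \frac{1}{n} Z_u^* &\ge 2\gamma \lim_{n \to \infty} \frac{1}{n}\sum_{x_k \in N}\Bigl(d_k - \frac{d_{\max}}{r}\Bigr) \\
&= 2\gamma \lim_{n \to \infty} \frac{1}{n}\sum_{x_k \in N} d_k - 2\gamma\frac{d_{\max}}{r} \\
&= 2\gamma E(d) - 2\gamma\frac{d_{\max}}{r}.
\end{aligned}$$

This lower bound holds for arbitrarily large $r$; hence,

$$\lim_{n \to \infty} \frac{1}{n} Z_u^* \ge 2\gamma E(d) \quad (a.s.).$$

∎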



In the next section we show that this lower bound is tight by presenting an 
upper bound on the cost of the optimal solution that asymptotically approaches 
the same value. 



14.4.2 An Upper Bound

We prove Theorem 14.4.1 by analyzing the cost of the following three-step heuristic 
which provides an upper bound on Z*. In the first step, we partition the area A 
into subregions. Then, for each of these subregions, we find the optimal packing of 
the customers’ demands in the subregion, into bins of unit size. Finally, for each 
subregion, we allocate one vehicle to serve the customers in each bin. 

The Region Partitioning Scheme 

For a fixed $h > 0$, let $G(h)$ be an infinite grid of squares with side $\frac{h}{\sqrt{2}}$ and edges
parallel to the system coordinates. Recall that $A$ is the compact support of the
distribution function $\mu$, and let $A_1, A_2, \ldots, A_{t(h)}$ be the intersection of the squares
of $G(h)$ with the compact support $A$ that have $\mu(A_i) > 0$. Note $t(h) < \infty$ since $A$
is compact and $t(h)$ is independent of $n$.

Let $N(i)$ be the set of customers located in subregion $A_i$, and define $n(i) = |N(i)|$.
For every $i = 1, 2, \ldots, t(h)$, let $b^*(i)$ be the minimum number of bins
needed to pack the demands of customers in $N(i)$. Finally, for each subregion $A_i$,
$i = 1, 2, \ldots, t(h)$, let $n_j(i)$ be the number of customers in the $j$th bin of this optimal
packing, for each $j = 1, 2, \ldots, b^*(i)$.
We now proceed to find an upper bound on the value of our heuristic. Recall
that for each bin produced by the heuristic, we send a single vehicle to serve all the
customers in the bin. First, the vehicle visits the customer closest to the depot in
the subregion to which the bin belongs, then serves all the customers in the bin in
any order, and the vehicle returns to the depot through the closest customer again.
Let $d(i)$ be the distance from the depot to the closest customer in $N(i)$, that is, in
subregion $A_i$. Note that since each subregion $A_i$ is a subset of a square of side $\frac{h}{\sqrt{2}}$,
the distance between any two customers in $A_i$ is no more than $h$. Consequently,
using the method just described, the distance traveled by the vehicle that serves
all the customers in the $j$th bin of subregion $A_i$ is no more than

$$2d(i) + h(n_j(i) + 1).$$

Therefore,

$$\sum_{i=1}^{t(h)}\sum_{j=1}^{b^*(i)} \bigl[2d(i) + h(n_j(i) + 1)\bigr] \le 2\sum_{i=1}^{t(h)} b^*(i)d(i) + 2nh. \tag{14.3}$$

This inequality will be coupled with the following lemma to find an almost sure 
upper bound on the cost of this heuristic. 

Lemma 14.4.5 Under the conditions of Theorem 14.4.1, we have

$$\lim_{n \to \infty} \frac{1}{n}\sum_{i=1}^{t(h)} b^*(i)d(i) \le \gamma E(d) \quad (a.s.).$$



Proof. Let $p_i = \mu(A_i)$ be the probability that a given customer $x_k$ falls in subregion
$A_i$. Since $p_i > 0$, by the strong law of large numbers, $\lim_{n \to \infty} \frac{n(i)}{n} = p_i$ almost
surely and therefore $n(i)$ grows to infinity almost surely as $n$ grows to infinity.
Thus, we have

$$\lim_{n \to \infty} \frac{b^*(i)}{n(i)} = \lim_{n(i) \to \infty} \frac{b^*(i)}{n(i)} = \gamma \quad (a.s.).$$

Hence,

$$\begin{aligned}
\lim_{n \to \infty} \frac{1}{n}\sum_{i=1}^{t(h)} b^*(i)d(i) &= \lim_{n \to \infty} \frac{1}{n}\sum_{i=1}^{t(h)} \frac{b^*(i)}{n(i)} n(i)d(i) \\
&\le \lim_{n \to \infty} \frac{1}{n}\sum_{i=1}^{t(h)} \frac{b^*(i)}{n(i)} \sum_{x_k \in N(i)} d_k && (\text{since } d(i) \le d_k,\ \forall x_k \in N(i)) \\
&= \sum_{i=1}^{t(h)} \lim_{n \to \infty} \frac{b^*(i)}{n(i)} \lim_{n \to \infty} \frac{1}{n}\sum_{x_k \in N(i)} d_k \\
&= \gamma \lim_{n \to \infty} \frac{1}{n}\sum_{x_k \in N} d_k.
\end{aligned}$$

Using the strong law of large numbers, we have

$$\lim_{n \to \infty} \frac{1}{n}\sum_{i=1}^{t(h)} b^*(i)d(i) \le \gamma E(d) \quad (a.s.),$$

which completes the proof of this lemma. ∎

Remark: A simple modification of the proof of Lemma 14.4.5 shows that the in- 
equality that appears in the statement of the lemma can be replaced by equality 
(see Exercise 14.5). 

We can now finish the proof of Theorem 14.4.1. From equation (14.3) we
have

$$\frac{1}{n} Z_u^* \le \frac{2}{n}\sum_{i=1}^{t(h)} b^*(i)d(i) + 2h.$$

Taking the limits and using Lemma 14.4.5, we obtain

$$\lim_{n \to \infty} \frac{1}{n} Z_u^* \le 2\gamma E(d) + 2h \quad (a.s.).$$

Since this inequality holds for arbitrarily small $h > 0$, we have

$$\lim_{n \to \infty} \frac{1}{n} Z_u^* \le 2\gamma E(d) \quad (a.s.).$$

This upper bound combined with the lower bound of Lemma 14.4.4 proves the
main theorem.
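The three-step region partitioning heuristic of Section 14.4.2 can be sketched in Python as follows. Two assumptions are made for this illustration and are not part of the analysis above: the grid squares have side $h/\sqrt{2}$ so that any two customers in a cell are within distance $h$, and the optimal packing in each subregion is replaced by First-Fit Decreasing, which is polynomial but in general not optimal. The function names are hypothetical. The returned value is the per-bin route bound $2d(i) + h(n_j(i) + 1)$ summed over all bins.

```python
import math
from collections import defaultdict

def first_fit_decreasing(sizes):
    """Pack sizes into unit bins; return the number of items in each bin."""
    bins = []                      # each entry: [remaining capacity, item count]
    for s in sorted(sizes, reverse=True):
        for b in bins:
            if s <= b[0] + 1e-12:  # tolerance for floating-point sums
                b[0] -= s
                b[1] += 1
                break
        else:
            bins.append([1.0 - s, 1])
    return [b[1] for b in bins]

def region_partition_bound(points, demands, h):
    """Upper bound on route length; the depot is assumed to be at (0, 0)."""
    side = h / math.sqrt(2)
    cells = defaultdict(list)
    for (x, y), w in zip(points, demands):
        key = (math.floor(x / side), math.floor(y / side))
        cells[key].append((math.hypot(x, y), w))
    total = 0.0
    for members in cells.values():
        d_i = min(d for d, _ in members)      # closest customer in the cell
        for n_j in first_fit_decreasing([w for _, w in members]):
            total += 2 * d_i + h * (n_j + 1)  # one vehicle per bin
    return total
```

Shrinking `h` reduces the within-cell travel term at the cost of more, smaller subregions, which is the trade-off the limit argument exploits.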



14.5 Probabilistic Analysis of Classical Heuristics 

Bienstock et al. (1993) analyzed the average performance of heuristics that
belong to the route first-cluster second class. Recall our definition of this class: all
those heuristics that first order the customers according to their locations and
then partition this ordering to produce feasible clusters.

It is clear that the UITP(a) and UOP(a) heuristics described in Section 14.3 
belong to this class. As mentioned in Section 14.2, the Sweep algorithm suggested 
by Gillett and Miller can also be viewed as a member of this class. 

Bienstock et al. show that the performance of any heuristic in this class is
strongly related to the performance of an inefficient bin-packing heuristic called
Next-Fit (NF). The Next-Fit bin-packing heuristic can be described in the following
manner. Given a list of $n$ items, start with item 1 and place it in bin 1.
Suppose we are packing item $j$; let bin $i$ be the highest indexed nonempty bin. If
item $j$ fits in bin $i$, then place it there; else place it in a new bin indexed $i + 1$.
Thus, NF is an online heuristic; that is, it assigns items to bins according to the
order in which they appear without using any knowledge of subsequent items in
the list.
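The Next-Fit rule just described fits in a few lines. The following is a minimal sketch; the function name `next_fit` and the default unit capacity are choices made for this illustration.

```python
def next_fit(sizes, capacity=1.0):
    """Pack items in the given order; only the most recent bin stays open."""
    bins = []          # list of bins, each a list of item sizes
    remaining = 0.0    # free space in the highest indexed bin
    for s in sizes:
        if not bins or s > remaining:
            bins.append([s])           # open bin i + 1
            remaining = capacity - s
        else:
            bins[-1].append(s)         # item fits in the current bin
            remaining -= s
    return bins
```

Because a bin is never reopened once a new one is started, the items in every bin are consecutive in the input order, which is exactly the structure a route first-cluster second heuristic imposes on vehicle loads.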
The NF heuristic possesses some interesting properties that will be useful in
the analysis of the class route first-cluster second. Assume the items are indexed
$1, 2, \ldots, n$ and let a consecutive heuristic be one that assigns items to bins such
that items in any bin appear consecutively in the sequence. The following is a
simple observation.

Property 14.5.1 Among all consecutive heuristics, NF uses the least number of
bins.

The next property is similar to a property developed in Section 4.2 for $b^*$, the
optimal solution to the Bin-Packing Problem.

Property 14.5.2 Let the item sizes $w_1, w_2, \ldots, w_n, \ldots$ in the Bin-Packing Problem
be a sequence of independent random variables and let $b_n^{NF}$ be the number of
bins produced by NF on the items $1, 2, \ldots, n$. For every $t \ge 0$,

$$\Pr\{|b_n^{NF} - E(b_n^{NF})| > t\} \le 2\exp(-t^2/8n). \tag{14.4}$$

A direct result of this property is the following. The proof is left as an exercise 
(Exercise 14.2). 

Corollary 14.5.3 For any n > 1, 

bn F < E(b^ F ) + d\Jn log n ( a.s .). 

The next property is a simple consequence of the theory of subadditive processes
(see Section 4.2) and the structure of solutions generated by NF.

Property 14.5.4 For any distribution of item sizes, there exists a constant $\gamma^{NF} > 0$
such that $\lim_{n \to \infty} \frac{b_n^{NF}}{n} = \gamma^{NF}$ almost surely, where $b_n^{NF}$ is the number of bins
produced by the NF packing and $\gamma^{NF}$ depends only on the distribution of the item
sizes.

These properties are used to prove the following theorem, the main result of this
section.

Theorem 14.5.5 (i) Let $H$ be a route first-cluster second heuristic. Then, under
the assumptions of Theorem 14.4.1, we have

$$\lim_{n \to \infty} \frac{1}{n} Z^{H} \ge 2\gamma^{NF} E(d) \quad (a.s.).$$

(ii) The UOP($\alpha$) heuristic is the best possible heuristic in this class; that is, for
any fixed $\alpha \ge 1$ we have

$$\lim_{n \to \infty} \frac{1}{n} Z^{UOP(\alpha)} = 2\gamma^{NF} E(d) \quad (a.s.).$$
In view of Theorems 14.4.1 and 14.5.5 it is interesting to compare $\gamma^{NF}$ to $\gamma$, since
the asymptotic error of any heuristic $H$ in the class of route first-cluster second
satisfies

$$\lim_{n \to \infty} Z^{H}/Z_u^* \ge \lim_{n \to \infty} Z^{UOP(\alpha)}/Z_u^* = \gamma^{NF}/\gamma.$$

Although in general the ratio is difficult to characterize, Karmarkar was able to 
characterize it for the case when the item sizes are uniformly distributed on an 
interval (0, a] for 0 < a < 1. For instance, for a satisfying ^ < a < 1, we have 

^ NF/7 = ;{ i (15 “ 3 - 9 “ 2 + - 1 ) + )}• 

so that when the item sizes are uniform on $(0, 1]$ the above ratio is $\frac{4}{3}$, which implies
that UOP($\alpha$) converges to a value which is 33.3% more than the optimal cost, a
very disappointing performance for the best heuristic currently available in terms
of worst-case behavior.

Moreover, heuristics in the route first-cluster second class can never be asymp- 
totically optimal for the UCVRP, except in some trivial cases (e.g., demands are 
all the same size). In fact, Theorem 14.5.5 clearly demonstrates that the route 
first-cluster second class suffers from misplaced priorities. The routing (in the first 
phase) is done without any regard to the customer demands and thus this leads 
to a packing of demands into vehicles that is at best like the Next-Fit bin-packing 
heuristic. This is clearly suboptimal in all but trivial cases, one being when cus- 
tomers have equal demands, and thus we see the connection with the results of 
the previous chapter. Therefore, this theorem shows that an asymptotically op- 
timal heuristic for the UCVRP must use an asymptotically optimal bin-packing 
heuristic to pack the customer demands into the vehicles. 

In the next two subsections we prove Theorem 14.5.5 by developing a lower
bound (Section 14.5.1) on $Z^{H}$ and an upper bound on $Z^{UOP(\alpha)}$ (Section 14.5.2).



14.5.1 A Lower Bound



In this section, we present a lower bound on the solution produced by these heuris- 
tics. Let H denote a route first-cluster second heuristic. 

As in Section 14.4.1, let $A$ be the compact support of the distribution $\mu$, and
define $d_{\max} = \sup_{x \in A}\{\|x\|\}$. Given a fixed integer $r \ge 1$, define $d_j = (j - 1)\frac{d_{\max}}{r}$
for $j = 1, 2, \ldots, r$, and construct the following $r$ sets of customers:

$$F_j = \{x_k \in N \mid d_j < d_k\} \quad \text{for } j = 1, \ldots, r.$$

Note that $F_r \subseteq F_{r-1} \subseteq \cdots \subseteq F_1$, and $F_1 = N$ since, without loss of generality,
$d_k > 0$ for all $x_k \in N$.

Let the customers be indexed $x_1, x_2, \ldots, x_n$ according to the order determined
by the heuristic $H$ in the route-first phase.

For any set of customers $T \subseteq N$, let $b^{NF}(T)$ be the number of bins generated by
the Next-Fit heuristic when applied to the Bin-Packing Problem defined by item
sizes equal to the demands of the customers in $T$, packed in the order of increasing
index.

Lemma 14.5.6 For any $r \ge 1$,

$$Z^{H} \ge 2\frac{d_{\max}}{r}\sum_{j=2}^{r} b^{NF}(F_j).$$

Proof. For a given solution constructed by $H$, let $V(F_j)$ be the number of vehicles
that serve at least one customer in $F_j$, for $j = 1, 2, \ldots, r$. By this definition,
$V(F_j) - V(F_{j+1})$, $j = 1, 2, \ldots, r - 1$, is exactly the number of vehicles whose farthest
customer visited is in $F_j$ but not in $F_{j+1}$, and trivially $V(F_r)$ is the number of
vehicles whose farthest customer visited is in $F_r$. Hence,

$$\begin{aligned}
Z^{H} &\ge 2 d_r V(F_r) + \sum_{j=1}^{r-1} 2 d_j \bigl(V(F_j) - V(F_{j+1})\bigr) \\
&= 2 d_1 V(F_1) + \sum_{j=2}^{r} 2 (d_j - d_{j-1}) V(F_j).
\end{aligned}$$

For a given subset of customers $F_j$, $j = 1, 2, \ldots, r$, the $V(F_j)$ vehicles that
contain these customer demands (in the solution produced by $H$) can be ordered
in such a way that the customer indices are in increasing order. Disregarding the
demands of customers in these vehicles that are not in $F_j$, this represents the
solution produced by a consecutive packing heuristic on the demands of customers
in $F_j$. By Property 14.5.1 we must have $V(F_j) \ge b^{NF}(F_j)$ for every $j = 1, 2, \ldots, r$.
This, together with $d_1 = 0$ and $d_j - d_{j-1} = \frac{d_{\max}}{r}$, implies that

$$Z^{H} \ge 2\frac{d_{\max}}{r}\sum_{j=2}^{r} b^{NF}(F_j).$$

∎



This lemma is used to derive an asymptotic lower bound on the cost of the solution
produced by $H$ that holds almost surely. The proof of the lemma is identical
to the proof of Lemma 14.4.4.

Lemma 14.5.7 Under the conditions of Theorem 14.4.1, we have

$$\lim_{n \to \infty} \frac{1}{n} Z^{H} \ge 2\gamma^{NF} E(d) \quad (a.s.).$$

In the next section we show that this lower bound is asymptotically tight in the 
case of UOP(a) by presenting an upper bound that approaches the same value. 
14.5.2 The UOP($\alpha$) Heuristic

We prove Theorem 14.5.5 by finding an upper bound on $Z^{UOP(\alpha)}$. Let $L_\alpha$ be
the length of the $\alpha$-optimal tour selected by UOP($\alpha$). Starting at the depot and
following the tour in an arbitrary orientation, the customers and the depot are
numbered $x^{(0)}, x^{(1)}, x^{(2)}, \ldots, x^{(n)}$, where $x^{(0)}$ is the depot. Select an integer
$m = \lceil n^{\beta} \rceil$ for some fixed $\beta \in (\frac{1}{2}, 1)$ and note that for each such $\beta$ we have
$\lim_{n \to \infty} \frac{m}{n} = 0$ (i.e., $m = o(n)$) and $\lim_{n \to \infty} \frac{\sqrt{n}}{m} = 0$ (i.e., $\sqrt{n} = o(m)$). We partition the path
from $x^{(1)}$ to $x^{(n)}$ into $m + 1$ segments, such that each one contains exactly $\lfloor \frac{n}{m} \rfloor$
customers, except possibly the last one.

Number the segments $1, 2, \ldots, m + 1$ according to their appearance on the traveling
salesman tour, where each segment has exactly $\lfloor \frac{n}{m} \rfloor$ customers except possibly
segment $m + 1$. Let $L_i$ (respectively, $N_i$) be the length of (respectively, subset of
customers in) segment $i$, $1 \le i \le m + 1$. Finally, let $n_i = |N_i|$, $i = 1, 2, \ldots, m + 1$.

To obtain an upper bound on the cost of UOP($\alpha$), we apply the Next-Fit heuristic
to each segment separately, where items are packed in bins in the same order
they appear in the segment. This gives us a partition of the tour that must provide
an upper bound on the cost produced by UOP($\alpha$). Let $b_i^{NF}$ be the number of bins
produced by the Next-Fit heuristic when applied to the customer demands in segment
$i$. We assign a single vehicle to each bin produced by the above procedure,
each of which starts at the depot, visits the customers assigned to its corresponding
bin in the same order as they appear on the traveling salesman tour, and then
returns to the depot. Let $\bar{d}_i$ be the distance from the depot to the farthest customer
in $N_i$. Clearly, the total distance traveled by all the vehicles that serve the
customers in segment $i$, $1 \le i \le m + 1$, is no more than

$$2 b_i^{NF} \bar{d}_i + L_i.$$

Hence,

$$\begin{aligned}
Z^{UOP(\alpha)} &\le 2\sum_{i=1}^{m+1} b_i^{NF} \bar{d}_i + L_\alpha \\
&\le 2\sum_{i=1}^{m} b_i^{NF} \bar{d}_i + 2 b_{m+1}^{NF} d_{\max} + \alpha L^*.
\end{aligned} \tag{14.5}$$



Lemma 14.5.8 Under the conditions of Theorem 14.4.1, we have

$$\lim_{n \to \infty} \frac{1}{n}\sum_{i=1}^{m} b_i^{NF} \bar{d}_i \le \gamma^{NF} E(d) \quad (a.s.).$$



Proof. Since the number of customers in every segment $i$, $1 \le i \le m$, is exactly
$n_i = \lfloor \frac{n}{m} \rfloor$ and $\lim_{n \to \infty} \frac{m}{n} = 0$, we have for a given $i$, $1 \le i \le m$,

$$b_i^{NF} \le E(b_i^{NF}) + \sqrt{9 K n_i \log n_i} \quad (a.s.)$$

for any $K \ge 2$.

We now show that, for sufficiently large $n$, these $m$ inequalities hold simultaneously
almost surely. To prove this, note that Property 14.5.2 tells us that,
for $n_i$ large enough, the probability that one such inequality does not hold is no
more than $2\exp(-K \log n_i) = 2 n_i^{-K}$. Thus, the probability that at least one of
these inequalities is violated is no more than $2m(\frac{n}{m} - 1)^{-K}$. By the Borel-Cantelli
Lemma, these $m$ inequalities hold almost surely if $\sum_n m (\frac{m}{n})^{K} < \infty$. Choosing
$K > \frac{1+\beta}{1-\beta} > 3$ shows that this holds for any $m = \lceil n^{\beta} \rceil$ where $\frac{1}{2} < \beta < 1$.

Thus,

$$\lim_{n \to \infty} \frac{1}{n}\sum_{i=1}^{m} b_i^{NF} \bar{d}_i \le \gamma^{NF} \lim_{n \to \infty} \frac{1}{m}\sum_{i=1}^{m} \bar{d}_i \quad (a.s.).$$

Clearly, $\bar{d}_i \le d_k + L_i$ for every $x_k \in N_i$ and every $i = 1, 2, \ldots, m$. Thus,

$$\bar{d}_i \le \frac{1}{n_i}\sum_{x_k \in N_i} d_k + L_i \quad \text{for every } i = 1, 2, \ldots, m.$$

Hence,

$$\begin{aligned}
\lim_{n \to \infty} \frac{1}{m}\sum_{i=1}^{m} \bar{d}_i &\le \lim_{n \to \infty} \frac{1}{n - m}\sum_{i=1}^{m}\sum_{x_k \in N_i} d_k + \lim_{n \to \infty} \frac{1}{m} L_\alpha \\
&\le \lim_{n \to \infty} \frac{1}{n - m}\sum_{x_k \in N} d_k + \alpha \lim_{n \to \infty} \frac{1}{m} L^*.
\end{aligned}$$

Applying the strong law of large numbers and using $\lim_{n \to \infty} \frac{m}{n} = 0$, we have

$$\lim_{n \to \infty} \frac{1}{n - m}\sum_{x_k \in N} d_k = E(d) \quad (a.s.).$$

Now from Chapter 4, we know that the length of the optimal traveling salesman
tour through a set of $k$ points independently and identically distributed in a given
region grows almost surely like $\sqrt{k}$. This together with $\lim_{n \to \infty} \frac{\sqrt{n}}{m} = 0$ implies
that

$$\lim_{n \to \infty} \frac{L^*}{m} = 0 \quad (a.s.).$$

These facts complete the proof. ∎

We can now complete the proof of Theorem 14.5.5. From (14.5) and Lemma 14.5.8
we have

$$\lim_{n \to \infty} \frac{1}{n} Z^{UOP(\alpha)} \le 2\gamma^{NF} E(d) + 2 d_{\max} \lim_{n \to \infty} \frac{1}{n} b_{m+1}^{NF} + \alpha \lim_{n \to \infty} \frac{1}{n} L^* \quad (a.s.).$$

Finally, using Beardwood et al.'s (1959) result (see Theorem 4.3.2), and the fact
that the number of points in segment $m + 1$ is at most $m$, we obtain the desired
result.
14.6 The Uniform Model

To our knowledge, no polynomial time algorithm that is asymptotically optimal
is known for the UCVRP for general $\Phi$. We now describe such a heuristic for the
case where $\Phi$ is uniform on the interval $[0,1]$. For this distribution, it is known that
there exists an asymptotically optimal solution to the Bin-Packing Problem with
at most two items per bin. This forms the basis for the heuristic for the UCVRP,
called Optimal Matching of Pairs (OMP). It considers only feasible solutions in
which each vehicle visits no more than two customers. Among all such feasible
solutions, the heuristic finds the one with minimum cost. This can be done by
formulating the following integer linear program.

For every $x_k, x_l \in N$, let

$$c_{kl} = \begin{cases}
d_k + d_{kl} + d_l, & \text{if } k \ne l \text{ and } w_k + w_l \le 1; \\
2d_k, & \text{if } k = l; \\
\infty, & \text{otherwise.}
\end{cases}$$

The integer program to solve is

$$\text{Problem } P: \quad \min \sum_{k \le l} c_{kl} X_{kl}$$

s.t.

$$\sum_{l \ge k} X_{kl} + \sum_{l < k} X_{lk} = 1, \quad \forall k = 1, 2, \ldots, n \tag{14.6}$$

$$X_{kl} \in \{0, 1\}, \quad \forall k \le l. \tag{14.7}$$

For $k \le l$, $X_{kl}$ is 1 if a vehicle delivers items to customers $x_k$ and $x_l$ and is 0
otherwise. Constraint (14.6) ensures that each customer is visited.
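For very small instances, Problem $P$ can be solved exactly by enumerating pairings, which makes the cost structure $c_{kl}$ concrete. The sketch below is an illustration only: it uses a bitmask dynamic program with exponential running time rather than a polynomial matching algorithm, and it assumes Euclidean distances with the depot at the origin. The function name `omp_cost` is hypothetical.

```python
import math
from functools import lru_cache

def omp_cost(points, demands):
    """Minimum cost over solutions in which each vehicle serves 1 or 2 customers."""
    n = len(points)
    d = [math.hypot(x, y) for x, y in points]   # distance of each customer to the depot

    def c(k, l):
        if k == l:
            return 2 * d[k]                      # vehicle serves customer k alone
        if demands[k] + demands[l] <= 1:
            return d[k] + math.dist(points[k], points[l]) + d[l]
        return math.inf                          # pair exceeds unit capacity

    @lru_cache(maxsize=None)
    def best(mask):
        # mask has bit k set when customer k is still unserved
        if mask == 0:
            return 0.0
        k = (mask & -mask).bit_length() - 1      # lowest indexed unserved customer
        rest = mask & ~(1 << k)
        value = c(k, k) + best(rest)
        for l in range(k + 1, n):
            if (rest >> l) & 1:
                value = min(value, c(k, l) + best(rest & ~(1 << l)))
        return value

    return best((1 << n) - 1)
```

In the small example used to check this sketch, pairing the two customers on the x-axis saves two units of travel compared with serving every customer separately.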

It is not hard to see that $P$ can be solved in polynomial time since it is no more
than a classical weighted matching problem defined on a specific graph. Define the
following graph $G = (N, E)$, where each customer $x_k$ is represented by two nodes
$v_k$ and $v'_k$, for $k = 1, 2, \ldots, n$. The set of edges of $G$ is defined as follows.

$$\begin{aligned}
E = {} & \{(v_k, v'_k) \mid x_k \in N\} \\
& \cup \{(v_k, v_l) \mid x_k \in N, x_l \in N, k \ne l, w_k + w_l \le 1\} \\
& \cup \{(v'_k, v'_l) \mid x_k \in N, x_l \in N, k \ne l, w_k + w_l \le 1\}.
\end{aligned}$$

Thus, $G$ has $2n$ vertices. The length of edge $(v_k, v_l)$, for $k \ne l$, is $c_{kl}$, of edge
$(v_k, v'_k)$ is $c_{kk}$ and of edge $(v'_k, v'_l)$ is 0, for all $k$ and $l$.

Note that any given feasible solution to $P$ can be transformed into a feasible
solution to the matching problem on $G$ with the same cost. For any feasible solution
to $P$, choose edge $(v_k, v'_k)$ if customer $k$ is served by a vehicle that does not serve
any other customer and choose edges $(v_k, v_l)$ and $(v'_k, v'_l)$ if customers $x_k$ and $x_l$
are visited together. Similarly, any feasible solution to the matching problem can
be transformed into a feasible solution to $P$ with the same cost. Hence, the two
problems are equivalent.

An optimal matching in $G$ can be found in $O(n^3)$ using Lawler's (1976) algorithm.

The main result of this section is the following. 

Theorem 14.6.1 Let $x_k$, $k = 1, 2, \ldots, n$, be a sequence of independent random
variables having a distribution $\mu$ with compact support in $\mathbb{R}^2$. Let

$$E(d) = \int_{\mathbb{R}^2} \|x\| \, d\mu(x).$$

Let the demands $w_k$, $k = 1, 2, \ldots, n$, be a sequence of independent random variables
having a uniform distribution on $[0, 1]$ and assume that the demands and the
location of the customers are independent of each other. Then, the OMP heuristic
is asymptotically optimal. That is, with probability one,

$$\lim_{n \to \infty} \frac{Z_u^*}{n} = \lim_{n \to \infty} \frac{Z^{OMP}}{n} = E(d).$$



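To prove that the OMP heuristic is asymptotically optimal, we approximate its
performance by that of the Sliced Region Partitioning heuristic with parameters
$h$ and $r$ ($SRP(h, r)$). For any fixed positive integer $r \ge 1$, the set $N$ is partitioned
into the following $2r$ disjoint subsets, some of which may be empty.

$$N_j = \Bigl\{x_k \in N \Bigm| \tfrac{1}{2}\bigl(1 - \tfrac{j+1}{r}\bigr) < w_k \le \tfrac{1}{2}\bigl(1 - \tfrac{j}{r}\bigr)\Bigr\}, \quad j = 1, 2, \ldots, r - 1,$$

and

$$N^j = \Bigl\{x_k \in N \Bigm| \tfrac{1}{2}\bigl(1 + \tfrac{j-1}{r}\bigr) < w_k \le \tfrac{1}{2}\bigl(1 + \tfrac{j}{r}\bigr)\Bigr\}, \quad j = 1, 2, \ldots, r - 1.$$

Also

$$N_0 = \Bigl\{x_k \in N \Bigm| \tfrac{1}{2}\bigl(1 - \tfrac{1}{r}\bigr) < w_k \le \tfrac{1}{2}\Bigr\}$$

and

$$N_r = \Bigl\{x_k \in N \Bigm| \tfrac{1}{2}\bigl(1 + \tfrac{r-1}{r}\bigr) < w_k\Bigr\}.$$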

The number of customers in each $N_j$ (respectively, $N^j$) is denoted by $n_j$ (respectively,
$n^j$) for all possible values of $j$.

Note that for any $j = 1, 2, \ldots, r - 1$, one vehicle can deliver the demand of a
customer from $N_j$ together with the demand of exactly one customer from $N^j$. The
$SRP(h, r)$ heuristic generates pairs of customers, one customer from $N_j$ and one
from $N^j$, for every $j = 1, 2, \ldots, r - 1$, using the same region partitioning scheme
used in the proof of Theorem 14.4.1 (Section 14.4.2). The customers in $N_0 \cup N_r$
are served separately; a single vehicle is assigned to each of these customers.

For every subregion $A_i$, $i = 1, 2, \ldots, t(h)$, generated by the grid $G(h)$ (see Section
14.4.2) and for every $j = 1, 2, \ldots, r - 1$, let $N_j(i)$ (respectively, $N^j(i)$) be the subset
of points in $N_j$ (respectively, $N^j$) that fall in subregion $A_i$. Also, let $n_j(i) = |N_j(i)|$
and $n^j(i) = |N^j(i)|$.

In each subregion $A_i$, $i = 1, 2, \ldots, t(h)$, and for any $j = 1, 2, \ldots, r - 1$, we
arbitrarily match one customer from $N_j(i)$ with exactly one customer from $N^j(i)$;
one vehicle serves each such pair. If $n_j(i) = n^j(i)$, then all customers in $N_j(i) \cup N^j(i)$
are matched and therefore visited in pairs. If, however, $n_j(i) \ne n^j(i)$, then
we can match exactly $\min\{n_j(i), n^j(i)\}$ pairs of customers. The remaining $|n_j(i) - n^j(i)|$
customers in $N_j(i) \cup N^j(i)$ that have not yet been matched are each served
by one vehicle. Thus the total number of vehicles used in subregion $A_i$ is

$$n_0(i) + n_r(i) + \sum_{j=1}^{r-1} \max\{n_j(i), n^j(i)\}.$$
j = i 



The heuristic clearly generates a feasible solution to the UCVRP. Moreover, this
solution is feasible for $P$, as each vehicle visits at most two customers. Thus,

$$Z^{OMP} \le Z^{SRP(h,r)} \quad \text{for any } r \ge 1 \text{ and } h > 0.$$

We now proceed by finding an upper bound on $Z^{SRP(h,r)}$. Essentially the same
analysis as in Section 14.4.2 shows that the total distance traveled by all vehicles
is no more than

$$2\sum_{i=1}^{t(h)} d(i)\Bigl[n_0(i) + n_r(i) + \sum_{j=1}^{r-1} \max\{n_j(i), n^j(i)\}\Bigr] + 2nh.$$

Since

$$\lim_{n(i) \to \infty} \frac{n_j(i)}{n(i)} = \lim_{n(i) \to \infty} \frac{n^j(i)}{n(i)} = \frac{1}{2r} \quad (a.s.) \quad \text{for all } j = 1, 2, \ldots, r,$$

we have

$$\lim_{n(i) \to \infty} \frac{1}{n(i)}\Bigl[n_0(i) + n_r(i) + \sum_{j=1}^{r-1} \max\{n_j(i), n^j(i)\}\Bigr] = \frac{1}{2} + \frac{1}{2r} \quad (a.s.).$$

The remainder of the proof is identical to the proof of the upper bound of Theorem
14.4.1.
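The vehicle count used in the SRP analysis can be illustrated with a short sketch. It assumes a particular slicing of $[0, 1]$ into $2r$ intervals of width $\frac{1}{2r}$: lower slice $j$ covers $(\frac{1}{2}(1 - \frac{j+1}{r}), \frac{1}{2}(1 - \frac{j}{r})]$ and is paired with upper slice $j$ covering $(\frac{1}{2}(1 + \frac{j-1}{r}), \frac{1}{2}(1 + \frac{j}{r})]$, so any such pair fits in a unit vehicle; the two remaining slices are served singly. The function name `srp_vehicles` is hypothetical, and the input is the list of demands falling in one subregion.

```python
import math
from collections import Counter

def srp_vehicles(demands, r):
    """Number of vehicles SRP-style pairing uses for the demands in one subregion."""
    low, high = Counter(), Counter()
    singles = 0
    for w in demands:
        t = 2 * r * w                      # demand measured in slices of width 1/(2r)
        if w <= 0.5:
            j = min(r - math.ceil(t), r - 1)
            if j == 0:
                singles += 1               # slice just below 1/2 has no partner
            else:
                low[j] += 1
        else:
            j = math.ceil(t - r)
            if j >= r:
                singles += 1               # slice just below 1 has no partner
            else:
                high[j] += 1
    # matched pairs share a vehicle; unmatched customers get their own
    return singles + sum(max(low[j], high[j]) for j in range(1, r))
```

For uniform demands each slice receives about a fraction $\frac{1}{2r}$ of the customers, so the count per customer approaches $\frac{1}{2} + \frac{1}{2r}$, in line with the limit stated above.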

Therefore, the OMP is asymptotically optimal when demands are uniformly
distributed between 0 and 1. In fact, the proof can be extended to a larger class of
demand distributions. For example, for any demand distribution with symmetric
density, one with $f(x) = f(1 - x)$ for $x \in [0,1]$, one can show that the same result
holds.



14.7 The Location-Based Heuristic 

Recently, Bramel and Simchi-Levi (1995) used the insight obtained from the anal- 
ysis of the asymptotic optimal solution value (see Theorem 14.4.1 above and the 
discussion that follows it) to develop a new and effective class of heuristics for the
UCVRP called Location-Based Heuristics. Specifically, this class of heuristics was 
motivated by the following observations. 

A byproduct of the proof of Theorem 14.4.1 is that the region partitioning 
scheme used to find an upper bound on Z* is asymptotically optimal. Unfortu- 
nately, the scheme is not polynomial since it requires, among other things, opti- 
mally solving the Bin-Packing Problem. But, the scheme suggests that, asymptot- 
ically, the tours in an optimal solution will be of a very simple structure consisting 
of two parts. The first is the round trip the vehicle makes from the depot to the 
subregion (where the customers are located); we call these the simple tours. The 
second is the additional distance (we call this insertion cost) accrued by visiting 
each of the customers it serves in the subregion. Our goal is therefore to construct 
a heuristic that assigns customers to vehicles so as to minimize the sum of the 
length of all simple tours plus the total insertion costs of customers into each 
simple tour. If done carefully, the solution obtained is asymptotically optimal. 

To construct such a heuristic we formulate the routing problem as another
combinatorial problem commonly called (see, e.g., Pirkul (1987)) the single-source
Capacitated Facility Location Problem (CFLP). This problem can be described
as follows: given $m$ possible sites for facilities of fixed capacity $Q$, we would like to
locate facilities at a subset of these $m$ sites and assign $n$ retailers, where retailer
$i$ demands $w_i$ units of a facility’s capacity, in such a way that each retailer is
assigned to exactly one facility, the facility capacities are not exceeded and the total
cost is minimized. A site-dependent cost is incurred for locating each facility; that
is, if a facility is located at site $j$, the set-up cost is $v_j$, for $j = 1,2,\dots,m$. The cost
of assigning retailer $i$ to facility $j$ is $c_{ij}$ (the assignment cost), for $i = 1,2,\dots,n$
and $j = 1,2,\dots,m$.

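The single-source CFLP can be formulated as the following integer linear
program. Let

$$y_j = \begin{cases} 1, & \text{if a facility is located at site } j,\\ 0, & \text{otherwise,} \end{cases}$$

and let

$$x_{ij} = \begin{cases} 1, & \text{if retailer } i \text{ is assigned to a facility at site } j,\\ 0, & \text{otherwise.} \end{cases}$$

$$\begin{aligned}
\text{Problem CFLP:}\quad \text{Min}\quad & \sum_{i=1}^{n}\sum_{j=1}^{m} c_{ij}x_{ij} + \sum_{j=1}^{m} v_j y_j \\
\text{s.t.}\quad & \sum_{j=1}^{m} x_{ij} = 1, && \forall i && (14.8)\\
& \sum_{i=1}^{n} w_i x_{ij} \le Q, && \forall j && (14.9)\\
& x_{ij} \le y_j, && \forall i,j && (14.10)\\
& x_{ij} \in \{0,1\}, && \forall i,j && (14.11)\\
& y_j \in \{0,1\}, && \forall j. && (14.12)
\end{aligned}$$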

Constraints (14.8) ensure that each retailer is assigned to exactly one facility, 
and constraints (14.9) ensure that the facility’s capacity constraint is not violated. 
Constraints (14.10) guarantee that if a retailer is assigned to site j, then a facility 
is located at that site. Constraints (14.11) and (14.12) ensure the integrality of the 
variables. 
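
For very small instances, the formulation can be checked by direct enumeration. The following Python sketch is illustrative only and is not the solution method used in the text: the function name `solve_cflp_brute_force` and the toy data are assumptions, and the running time grows as $m^n$. It assumes nonnegative set-up costs, so that a facility is opened only at sites that actually receive a retailer.

```python
from itertools import product


def solve_cflp_brute_force(c, v, w, Q):
    """Enumerate all single-source assignments of n retailers to m sites.

    c[i][j]: assignment cost, v[j]: set-up cost, w[i]: demand, Q: capacity.
    Returns (best_cost, assignment), where assignment[i] is the site serving
    retailer i, or (inf, None) if no feasible assignment exists.
    """
    n, m = len(w), len(v)
    best_cost, best_assign = float("inf"), None
    # Each tuple assigns exactly one site to every retailer: constraint (14.8).
    for assign in product(range(m), repeat=n):
        load = [0.0] * m
        for i, j in enumerate(assign):
            load[j] += w[i]
        # Capacity constraint (14.9).
        if any(total > Q + 1e-12 for total in load):
            continue
        # Constraint (14.10): a site is opened exactly when it is used.
        open_sites = set(assign)
        cost = sum(c[i][j] for i, j in enumerate(assign))
        cost += sum(v[j] for j in open_sites)
        if cost < best_cost:
            best_cost, best_assign = cost, assign
    return best_cost, best_assign


# Hypothetical toy instance: three retailers, two candidate sites.
c = [[0, 4], [4, 0], [1, 3]]
v = [10, 10]
w = [1, 1, 1]
print(solve_cflp_brute_force(c, v, w, Q=2))
```

With capacity 2 no single site can serve all three retailers, so both facilities must be opened; raising the capacity to 3 makes a single facility feasible and cheaper.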

In formulating the UCVRP as an instance of the CFLP, we set every customer
$x_j$ in the UCVRP as a possible facility site in the location problem. The length of
the simple tour that starts at the depot, visits customer $x_j$ and then goes back to
the depot is the set-up cost in the location problem (i.e., $v_j = 2d_j$). Finally, the
cost of inserting a customer into a simple tour in the UCVRP is the assignment
cost in the location problem (i.e., $c_{ij} = d_i + d_{ij} - d_j$). This cost should represent
the added cost of inserting customer $i$ into a simple tour through the depot and
customer $j$. Consequently, when $i$ is added to a tour with $j$, the added cost is
$c_{ij} = d_i + d_{ij} - d_j$, so that $v_j + c_{ij} = d_i + d_{ij} + d_j$. However, when a third customer
is added, the calculation is not so simple, and therefore the values of $c_{ij}$ should in
fact represent an approximation to the cost of adding $i$ to a tour that goes through
customer $j$ and the depot. Hence, a solution for the CVRP is obtained by
solving the CFLP with the data as described above. The solution obtained from
the CFLP is transformed (in an obvious way) to a solution to the CVRP.
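
The cost data described above can be computed directly from customer coordinates. The sketch below is a minimal illustration, assuming Euclidean distances and a depot at the origin; the function name `lbh_costs` is hypothetical.

```python
import math


def lbh_costs(points, depot=(0.0, 0.0)):
    """Return (v, c) with v[j] = 2*d_j and c[i][j] = d_i + d_ij - d_j.

    points: list of (x, y) customer locations; every customer is also a
    candidate facility site, as in the text.
    """
    d = [math.dist(p, depot) for p in points]  # d_j: distance to the depot
    n = len(points)
    v = [2.0 * d[j] for j in range(n)]  # length of the simple tour through j
    c = [
        [d[i] + math.dist(points[i], points[j]) - d[j] for j in range(n)]
        for i in range(n)
    ]  # insertion cost of customer i into the simple tour through j
    return v, c


# Hypothetical data: two customers on the same ray from the depot.
v, c = lbh_costs([(3.0, 4.0), (6.0, 8.0)])
print(v, c)
```

Note that $c_{jj} = 0$, so selecting customer $j$ as a seed and assigning it to itself costs exactly the simple tour length $2d_j$.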

Although NP-Hard, the CFLP can be solved efficiently, but approximately,
by the familiar Lagrangian relaxation technique (see Chapter 12), as described
in Pirkul (1987) or Bramel and Simchi-Levi (1995), or by a cutting-plane algorithm, as
described in Deng and Simchi-Levi (1992).

We can now describe the Location-Based Heuristic (LBH): 

The Location-Based Heuristic 

Step 1: Formulate the UCVRP as an instance of the CFLP. 

Step 2: Solve the CFLP. 

Step 3: Transform the solution obtained in Step 2 into a solution for the UCVRP. 
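
Step 3 is left unspecified in the text. One simple way to carry it out, shown here only as an assumption, is to group customers by the seed they were assigned to and sequence each group with a nearest-neighbor rule starting from the depot. The function names `clusters_to_routes` and `route_length` are hypothetical, and any tour-construction method could replace the nearest-neighbor rule.

```python
import math
from collections import defaultdict


def clusters_to_routes(points, assign, depot=(0.0, 0.0)):
    """Turn a CFLP assignment into vehicle routes (one route per open site).

    assign[i] is the seed (site) to which customer i was assigned.
    Each cluster is ordered greedily by nearest neighbor from the depot.
    """
    groups = defaultdict(list)
    for i, j in enumerate(assign):
        groups[j].append(i)
    routes = []
    for members in groups.values():
        route, pos, left = [], depot, set(members)
        while left:
            # Closest unvisited customer; ties are broken by index.
            nxt = min(left, key=lambda k: (math.dist(pos, points[k]), k))
            route.append(nxt)
            left.remove(nxt)
            pos = points[nxt]
        routes.append(route)
    return routes


def route_length(points, route, depot=(0.0, 0.0)):
    """Length of the tour depot -> route[0] -> ... -> route[-1] -> depot."""
    stops = [depot] + [points[k] for k in route] + [depot]
    return sum(math.dist(a, b) for a, b in zip(stops, stops[1:]))


# Hypothetical data: customers 0 and 1 share seed 1, customer 2 is its own seed.
pts = [(1.0, 0.0), (2.0, 0.0), (0.0, 3.0)]
routes = clusters_to_routes(pts, [1, 1, 2])
print(routes, [route_length(pts, r) for r in routes])
```

Capacity feasibility is inherited from the CFLP solution, since the customers on each route are exactly those assigned to one facility.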



Variations of the LBH can also be applied to other problems; we discuss this 
and related issues in the next chapter where we consider a more general vehicle 
routing problem. 

The LBH algorithm was tested on a set of 11 standard test problems taken from 
the literature. The problems are in the Euclidean plane and they vary in size from 
15 to 199 customers. The performance of the algorithm on these test problems 
was found to be comparable to the performance of most published heuristics. This 
includes both the running time of the algorithm as well as the quality (value) of 
the solutions found; see Bramel and Simchi-Levi (1995) for a detailed discussion.

One way to explain the excellent performance of the LBH is by analyzing its
average performance. Indeed, a proof similar to the proof of Theorem 14.4.1 reveals
(see also Bramel and Simchi-Levi (1995)) that,

Theorem 14.7.1 Under the assumptions of Theorem 14.4.1, there are versions
of the LBH that are asymptotically optimal; that is,

$$\lim_{n\to\infty}\frac{1}{n}Z^{LBH} = 2\gamma E(d) \quad (a.s.).$$

Finally, we observe that the Generalized Assignment Heuristic due to Fisher 
and Jaikumar (1981) can be viewed as a special case of the LBH in which the seed 
customers are selected by a dispatcher. In the second step, customers are assigned 
to the seeds in an efficient way by solving a generalized assignment problem. The 
advantage of the LBH is that the selection of the seeds and the assignment of 
customers to seeds are done simultaneously, and not sequentially as in the Gen- 
eralized Assignment Heuristic. Note that neither of these heuristics (the LBH or 
the Generalized Assignment Heuristic) requires that potential seed points be cus- 
tomer locations; both can be easily implemented to start with seed points that are 
simply points on the plane. A byproduct of the analysis, therefore, is that when 
the Generalized Assignment Heuristic is carefully implemented (i.e., “good” seeds 
are selected), it is asymptotically optimal as well. 



14.8 Rate of Convergence to the Asymptotic Value 

While the results in the two previous sections completely characterize the asymp- 
totic optimal solution value of the UCVRP, they do not say anything about the 
rate of convergence to the asymptotic solution value. See Psaraftis (1984) for an 
informal discussion of this issue. 

To get some intuition on the rate of convergence, it is interesting to determine the 
expected difference between the optimal solution for a given number of customers 
$n$, and the asymptotic solution value (i.e., $2\gamma E[d]$). This can be done for the
uniform model discussed in Section 14.6. 

In this case, Bramel et al. (1991) and, independently, Rhee (1991) proved the 
following strong result. 

Theorem 14.8.1 Let $x_k$, $k = 1,2,\dots,n$ be a sequence of independent random
variables uniformly distributed in the unit square $[0,1]^2$. Let the demands $w_k$,
$k = 1,2,\dots,n$ be drawn independently from a uniform distribution on $(0,1]$. Then

$$E[Z^*] = nE[d] + \Theta(n^{2/3}).$$

The proof of Theorem 14.8.1 relies heavily on the theory of three-dimensional 
stochastic matching which is outside the scope of our survey. We refer the reader 
to Coffman and Lueker (1991, Chapter 3) for an excellent review of matching 
problems.

Rhee has also found an upper bound on the rate of convergence to the
asymptotic solution value, for general distribution of the customers’ locations and their
demands. Using a new matching theorem developed together with Talagrand, she
proved:

Theorem 14.8.2 Under the assumptions of Theorem we have 

$$2n\gamma E[d] \le E[Z^*] \le 2n\gamma E[d] + O((n\log n)^{2/3}).$$

14.9 Exercises 



Exercise 14.1. Consider the following heuristic for the CVRP with unequal de- 
mands. All customers of demand Wi > | are served individually, one customer per 
vehicle. To serve the rest, apply the UITP heuristic with vehicle capacity Q. Prove 
that this solution can be transformed into a feasible solution to the CVRP with 
unequal demands. What is the worst-case bound of this heuristic? 

Exercise 14.2. Prove Corollary 14.5.3. 

Exercise 14.3. Given a seed point $i$, assume you must estimate the cost of the
optimal traveling salesman tour through a set of points $S \cup \{i\}$ using the following
cost approximation. Starting with $2d_i$, when each point $j$ is added to the tour, add
the cost $c_{ij} = d_j + d_{ij} - d_i$. That is, show that for any $r > 1$ there is an example
where the approximation is $r$ times the optimal cost.

Exercise 14.4. Construct an example of the single-source CFLP where each fa- 
cility is a potential site (and vice versa) in which an optimal solution chooses a 
facility but the demand of that facility is assigned to another chosen site. 

Exercise 14.5. Show that Lemma 14.4.5 can be replaced by an equality instead 
of an inequality. 

Exercise 14.6. Prove that the version of the LBH with set-up costs $v_j = 2d_j$ and
assignment costs $c_{ij} = d_i + d_{ij} - d_j$ is asymptotically optimal.

Exercise 14.7. Explain why the following constraints can or cannot be integrated 
into the Savings Algorithm. 

(a) Distance constraint. Each route must be at most A miles long. 

(b) Minimum route size. Each route must pick up at least m points.

(c) Mixing constraints. Even indexed points cannot be on the same route as odd 
indexed points.

Exercise 14.8. Consider an instance of the CVRP with $n$ customers. A customer
is red with probability $p$ and blue with probability $1 - p$, for some $p \in [0,1]$. Red
customers have loads of size |, while blue customers have loads of size |. What is 
$\lim_{n\to\infty} Z^*/n$ as a function of $p$?



The VRP with Time Window Constraints



15.1 Introduction 

In many distribution systems each customer specifies, in addition to the load that 
has to be delivered to it, a period of time, called a time window, in which this
delivery must occur. The objective is to find a set of routes for the vehicles, where 
each route begins and ends at the depot, serves a subset of the customers without 
violating the vehicle capacity and time window constraints, while minimizing the 
total length of the routes. We call this model the Vehicle Routing Problem with 
Time Windows (VRPTW). 

Due to the wide applicability and the economic importance of the problem, 
variants of it have been extensively studied in the vehicle routing literature; for a 
review see Solomon and Desrosiers (1988). Most of the work on the problem has 
focused on an empirical analysis while very few papers have studied the problem 
from an analytical point of view. This is done in an attempt to characterize the 
theoretical behavior of heuristics and to use the insights obtained to construct 
effective algorithms. Some exceptions are the recent works of Federgruen and van
Ryzin (1992) and Bramel and Simchi-Levi (1996). Below we describe the results
of the latter paper.



15.2 The Model 

To formally describe the model we analyze here, let the index set of the $n$ customers
be denoted $N = \{1,2,\dots,n\}$. Let $x_k \in \mathbb{R}^2$ be the location of customer $k \in N$.
Assume, without loss of generality, that the depot is at the origin and, by rescaling,
that the vehicle capacity is 1 and that the length of the working day is 1. We
assume vehicles can leave and return to the depot at any time. Associated with
customer $k$ is a quadruplet $(w_k, e_k, s_k, l_k)$, called the customer parameters, which
represents, respectively, the load that must be picked up, the earliest starting time
for service, the time required to complete the service, called the service time, and
the latest time service can end. Clearly, feasibility requires that $e_k + s_k \le l_k$ and
$w_k, e_k, l_k \in [0,1]$, for each $k \in N$.

For any point $x \in \mathbb{R}^2$, let $\|x\|$ denote the Euclidean distance between $x$ and the
depot. Let $d_k = \|x_k\|$ be the distance between customer $k$ and the depot. Also, let
$d_{jk} = \|x_j - x_k\|$ be the distance between customer $j$ and customer $k$. Let $Z_t^*$ be
the total distance traveled in an optimal solution to the VRPTW, and let $Z_t^H$ be
the total distance traveled in the solution provided by a heuristic H.

Consider the customer locations to be distributed according to a distribution $\mu$
with compact support in $\mathbb{R}^2$. Let the customer parameters $\{(w_k, e_k, s_k, l_k) : k \in N\}$
be drawn from a joint distribution $\Phi$ with a continuous density $\phi$. Let $C$ be the
support of $\phi$; that is, $C$ is a subset of $\{(\theta_1,\theta_2,\theta_3,\theta_4) \in [0,1]^4 : \theta_2 + \theta_3 \le \theta_4\}$. Each
customer is therefore represented by its location in the Euclidean plane along with
a point in $C$. Finally, we assume that a customer’s location and its parameters are
independent of each other.

In our analysis we associate a job with each customer. The parameters of job
$k$ are the parameters of customer $k$, that is, $(w_k, e_k, s_k, l_k)$, where $w_k$ is referred
to as the load of job $k$ and, using standard scheduling terminology, $e_k$ represents
the earliest time job $k$ can begin processing, $s_k$ represents the processing time
and $l_k$ denotes the latest time the processing of the job can end. The value of $e_k$
can be thought of as the release time of job $k$, that is, the time it is available for
processing. The value of $l_k$ represents the due date for the job. Each job can be
viewed abstractly as simply a point in $C$. Occasionally, we will refer to customers
and jobs interchangeably; this convenience should cause no confusion.

To any set of customers $T \subseteq N$ with parameters $\{(w_k, e_k, s_k, l_k) : k \in T\}$, we
associate a corresponding machine scheduling problem as follows. Consider the set
of jobs $T$ and an infinite sequence of parallel machines. Job $k$ becomes available
for processing at time $e_k$ and must be finished processing by time $l_k$. The
objective in this scheduling problem is to assign each job to a machine such that (i)
each machine has at most one job being processed on it at a given time, (ii) the
processing time of each job starts no earlier than its release time and ends no later
than its due date and (iii) the total load of all jobs assigned to a machine is no
more than 1, and the number of machines used is minimized. In our discussion we
refer to (ii) as the job time window constraint and to (iii) as the machine load
constraint.
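
Conditions (i) to (iii) can be checked for a candidate set of jobs on one machine. The Python sketch below is an illustration under stated assumptions, not a method from the text: jobs are processed without preemption, each job is a tuple `(w, e, s, l)`, and all processing orders are enumerated, which is practical only for a handful of jobs. The name `fits_one_machine` is hypothetical.

```python
from itertools import permutations


def fits_one_machine(jobs, capacity=1.0):
    """Return True if the jobs can all be processed on a single machine.

    jobs: list of (w, e, s, l) = (load, release time, processing time, due date).
    Checks the machine load constraint (iii), then searches for an order in
    which every job starts at or after e and finishes by l without overlap.
    """
    if sum(w for w, _, _, _ in jobs) > capacity + 1e-12:
        return False  # machine load constraint violated
    for order in permutations(jobs):
        t = 0.0  # time at which the machine becomes free
        feasible = True
        for _, e, s, l in order:
            start = max(t, e)  # for a fixed order, start as early as possible
            if start + s > l + 1e-12:
                feasible = False  # job time window constraint violated
                break
            t = start + s
        if feasible:
            return True
    return False


# Hypothetical jobs: the pair fits only if the second due date is late enough.
a = (0.4, 0.0, 0.3, 0.3)
print(fits_one_machine([a, (0.4, 0.2, 0.3, 0.6)]))
print(fits_one_machine([a, (0.4, 0.2, 0.3, 0.5)]))
```

For a fixed processing order, starting each job as early as possible never hurts feasibility, which is why the inner loop needs no further search.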

Scheduling problems have been widely studied in the operations research liter- 
ature; see Lawler et al. (1993) and Pinedo (1995). Unfortunately, no paper has 
considered the scheduling problem in its general form with the objective function 
of minimizing the number of machines used. 

Observe that in the absence of time window constraints, the scheduling problem
is no more than a Bin-Packing Problem. Indeed, in that case the VRPTW reduces
to the model analyzed in the previous chapter, the CVRP. Thus, our strategy is 
to try to relate the machine scheduling problem to the VRPTW in much the same 
way as we used results obtained for the Bin-Packing Problem in the analysis of 
the CVRP. As we shall shortly see, this is much more complex. 

Let $M^*(S)$ be the minimum number of machines needed to schedule a set $S$ of
jobs. It is clear that this machine scheduling problem possesses the subadditivity
property, described in Section 4.2. This implies that if $M_n^*$ is the minimum number
of machines needed to schedule a set of $n$ jobs whose parameters are drawn
independently from a distribution $\Phi$, then there exists a constant $\gamma > 0$ (depending
only on $\Phi$) such that $\lim_{n\to\infty} M_n^*/n = \gamma$ (a.s.).

In this chapter we relate the solution to the VRPTW to the solution to the 
scheduling problem defined by the customer parameters. That is, we show that 
asymptotically the VRPTW is no more difficult to solve than the corresponding 
scheduling problem. Our main result is the following. 

Theorem 15.2.1 Let $x_1, x_2, \dots, x_n$ be independently and identically distributed
according to a distribution $\mu$ with compact support in $\mathbb{R}^2$, and define

$$E(d) = \int_{\mathbb{R}^2} \|x\|\, d\mu(x).$$

Let the customer parameters $\{(w_k, e_k, s_k, l_k) : k \in N\}$ be drawn independently
from $\Phi$. Let $M_n^*$ be the minimum number of machines needed to feasibly schedule
the $n$ jobs corresponding to these parameters, and $\lim_{n\to\infty} M_n^*/n = \gamma$ (a.s.). Then

$$\lim_{n\to\infty}\frac{1}{n}Z_t^* = 2\gamma E(d) \quad (a.s.).$$

We prove this theorem (in Section 15.3) by introducing a lower bound on the 
optimal solution value and then developing an upper bound that converges to the 
same value. The lower bound uses a similar technique to the one developed in 
Chapter 14. The upper bound can be viewed as a randomized algorithm that is 
guaranteed to generate a feasible solution to the problem. That is, different runs of 
the algorithm on the same data may generate different feasible solutions. In Section 
15.4, we show that the analysis leads, in a natural way, to the development of a new 
deterministic algorithm which is asymptotically optimal for the VRPTW. Though 
not polynomial, computational evidence shows that the algorithm works very well 
on a set of standard test problems. 



15.3 The Asymptotic Optimal Solution Value 

We start the analysis by introducing a lower bound on the optimal objective
function value $Z_t^*$. First, let $A$ be the compact support of $\mu$, and define
$d_{\max} = \sup\{\|x\| : x \in A\}$. Pick a fixed integer $r > 1$, and define $d_j = (j - 1)\frac{d_{\max}}{r}$, for
$j = 1,2,\dots,r$. Now define the sets:

$$F_j = \{k \in N : d_j < d_k\} \quad \text{for } j = 1,2,\dots,r.$$

For any set $T \subseteq N$, let $M^*(T)$ be the minimum number of machines needed to
feasibly schedule the set of jobs $\{(w_k, e_k, s_k, l_k) : k \in T\}$. The next lemma provides
a deterministic lower bound on $Z_t^*$ and is analogous to Lemma 14.4.3 developed
for the VRP with capacity constraints.

Lemma 15.3.1

$$Z_t^* \ge 2\frac{d_{\max}}{r}\sum_{j=2}^{r} M^*(F_j).$$

Proof. Let $V_j^*$ be the number of vehicles in an optimal solution to the VRPTW
that serve a customer from $F_j$, for $j = 1,2,\dots,r$. By this definition, $V_r^*$ is exactly
the number of vehicles whose farthest customer visited is in $F_r$, and $V_j^* - V_{j+1}^*$
is exactly the number of vehicles whose farthest customer visited is in $F_j \setminus F_{j+1}$.
Observe that if $V_j^* = V_{j+1}^*$, then there are no vehicles whose farthest customer
visited is in $F_j \setminus F_{j+1}$. Consequently,

$$\begin{aligned}
Z_t^* &\ge 2d_rV_r^* + 2\sum_{j=1}^{r-1} d_j\,(V_j^* - V_{j+1}^*)\\
&= 2d_1V_1^* + 2\sum_{j=2}^{r}(d_j - d_{j-1})\,V_j^*\\
&= 2\frac{d_{\max}}{r}\sum_{j=2}^{r} V_j^*.
\end{aligned}$$
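
The first equality in this chain is a summation by parts. Written out, by splitting the sum and shifting the index of the second part,

$$2d_rV_r^* + 2\sum_{j=1}^{r-1} d_j\,(V_j^* - V_{j+1}^*) = 2\sum_{j=1}^{r} d_jV_j^* - 2\sum_{j=2}^{r} d_{j-1}V_j^* = 2d_1V_1^* + 2\sum_{j=2}^{r}(d_j - d_{j-1})\,V_j^*.$$

The last equality in the chain then uses $d_1 = 0$ and $d_j - d_{j-1} = d_{\max}/r$, both of which follow from the definition $d_j = (j-1)\,d_{\max}/r$.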

We now claim that for each $j = 1,2,\dots,r$, $V_j^* \ge M^*(F_j)$. This should be clear
from the fact that the set of jobs in $F_j$ can be feasibly scheduled on $V_j^*$ machines
by scheduling the jobs at the times they are served in the VRPTW solution. ∎

We can now determine the asymptotic value of this lower bound. This can be
done in a similar manner to that of Chapter 14, and hence we omit the proof here.

Lemma 15.3.2 Under the conditions of Theorem 15.2.1

$$\lim_{n\to\infty}\frac{1}{n}Z_t^* \ge 2\gamma E(d) \quad (a.s.).$$

We prove Theorem 15.2.1 by approximating the optimal cost from above by that 
of the following four-step heuristic. In the first step, we partition the region where 
the customers are distributed into subregions. In the second step, we randomly 
separate the customers of each subregion into two sets. Then for each subregion,
we solve a machine scheduling problem defined on the customers in one of these
sets. Finally, we use this schedule to specify how to serve all the customers in the 
subregion. 

Pick an $\epsilon > 0$, and let $\delta$ be given by the definition of continuity of $\phi$, that is,
$\delta > 0$ is such that for all $x, y \in C$ with $\|x - y\| < \delta$, we have $|\phi(x) - \phi(y)| < \epsilon$.
Finally, pick a $\Delta < \min\{\delta/\sqrt{2}, \epsilon\}$.

Let $G(\Delta)$ be an infinite grid of squares of diagonal $\Delta$, that is, of side $\Delta/\sqrt{2}$, with
edges parallel to the system coordinates. Recall that $A$ is the compact support of
$\mu$ and let $A_1, A_2, \dots, A_{t(\Delta)}$ be the subregions of $G(\Delta)$ that intersect $A$ and have
$\mu(A_i) > 0$.

Let $N(i)$ be the indices of the customers located in subregion $A_i$, and define
$n(i) = |N(i)|$. For each customer $k \in N(i)$, with parameters $(w_k, e_k, s_k, l_k)$, we
associate a job with parameters $(w_k, e_k, s_k + \Delta, l_k + \Delta)$. For any set $T \subseteq N$ of
customers, let $M_\Delta^*(T)$ be the minimum number of machines needed to feasibly
schedule the set of jobs $\{(w_k, e_k, s_k + \Delta, l_k + \Delta) : k \in T\}$. In addition, for any set
$T$ of customers, let $T(i) = N(i) \cap T$, for $i = 1,2,\dots,t(\Delta)$.
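
The grid partition and the index sets $N(i)$ are easy to compute. The sketch below is illustrative: it assumes the grid is anchored at the origin (the text does not fix the offset), and the function name `grid_partition` is hypothetical.

```python
import math
from collections import defaultdict


def grid_partition(points, delta):
    """Group customer indices by the grid square that contains them.

    Squares have diagonal delta, hence side delta / sqrt(2), with edges
    parallel to the coordinate axes. Returns {cell: [customer indices]},
    keeping only non-empty cells.
    """
    side = delta / math.sqrt(2.0)
    cells = defaultdict(list)
    for k, (x, y) in enumerate(points):
        cell = (math.floor(x / side), math.floor(y / side))
        cells[cell].append(k)
    return dict(cells)


# Hypothetical data: with delta = sqrt(2) the squares have side 1.
pts = [(0.2, 0.3), (0.9, 0.1), (1.5, 0.5), (-0.1, 0.2)]
print(grid_partition(pts, 2 ** 0.5))
```

Because each square has diagonal $\Delta$, any two customers in the same cell are at most $\Delta$ apart, which is the property the feasibility argument relies on.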

For the given grid partition and for any set $T \subseteq N$ of customers, the following
is a feasible way to serve the customers in $N$. All subregions are served separately;
that is, no customers from different subregions are served by the same vehicle.
In subregion $A_i$, we solve the machine scheduling problem defined by the jobs
$\{(w_k, e_k, s_k + \Delta, l_k + \Delta) : k \in T(i)\}$. Then, for each machine in this scheduling
solution, we associate a vehicle that serves the customers corresponding to the jobs
on that machine. The customers are visited in the exact order they are processed
on the machine, and they are served in exactly the same interval of time as they
are processed. This is repeated for each machine of the scheduling solution. The
customers of the set $N(i) \setminus T(i)$ are served one vehicle per customer. This strategy
is repeated for every subregion, thus providing a solution to the VRPTW.

We will show that for a suitable choice of the set T, this routing strategy is 
asymptotically optimal for the VRPTW. An interesting fact about the set T is 
that it is a randomly generated set; that is, each time the algorithm is run it results 
in different sets T. 

The first step is to show that, for any set $T \subseteq N$ (possibly empty), the
solution produced by the above-mentioned strategy provides a feasible solution to the
VRPTW. This should be clear from the fact that having an extra $\Delta$ units of time
to travel between customers in a subregion is enough since all subregions have
diagonal $\Delta$. Therefore, any sets of customers scheduled on a machine together can
be served together by one vehicle. Customers of $N(i) \setminus T(i)$ can clearly be served
within their time windows since they are served individually, one per vehicle.

We now proceed to find an upper bound on the value of this solution. For each
subregion $A_i$, let $n_j(i)$ be the number of jobs on the $j$th machine in the optimal
schedule of the jobs in $T(i)$, for each $j = 1,2,\dots,M_\Delta^*(T(i))$. Let $d(i)$ be the
distance from the depot to the closest customer in $N(i)$, that is, in subregion $A_i$.
Using the routing strategy described above, the distance traveled by the vehicle
serving the customers whose job was assigned to the $j$th machine of subregion $A_i$
is no more than

$$2d(i) + \Delta\,(n_j(i) + 1).$$

Therefore,

$$\begin{aligned}
Z_t^* &\le \sum_{i=1}^{t(\Delta)}\sum_{j=1}^{M_\Delta^*(T(i))}\big[2d(i) + \Delta\,(n_j(i) + 1)\big] + \sum_{k\notin T} 2d_k\\
&\le 2\sum_{i=1}^{t(\Delta)} M_\Delta^*(T(i))\,d(i) + 2n\Delta + \sum_{k\notin T} 2d_k.
\end{aligned}$$

Dividing by $n$ and taking the limit we have

$$\begin{aligned}
\lim_{n\to\infty}\frac{1}{n}Z_t^* &\le 2\sum_{i=1}^{t(\Delta)}\lim_{n\to\infty}\frac{1}{n}M_\Delta^*(T(i))\,d(i) + 2\Delta + \lim_{n\to\infty}\frac{1}{n}\sum_{k\notin T} 2d_k\\
&= 2\sum_{i=1}^{t(\Delta)}\lim_{n\to\infty}\frac{n(i)}{n}\,\frac{M_\Delta^*(T(i))}{n(i)}\,d(i) + 2\Delta + \lim_{n\to\infty}\frac{1}{n}\sum_{k\notin T} 2d_k\\
&\le 2\sum_{i=1}^{t(\Delta)}\Big(\lim_{n\to\infty}\frac{n(i)}{n}\,d(i)\Big)\Big(\lim_{n\to\infty}\frac{M_\Delta^*(T(i))}{n(i)}\Big) + 2\Delta + \lim_{n\to\infty}\frac{1}{n}\sum_{k\notin T} 2d_k. \qquad (15.1)
\end{aligned}$$

In order to relate this quantity to the lower bound of Lemma 15.3.2, we must
choose the set $T$ appropriately. For this purpose, we make the following
observation. Recall that $\phi$ is the continuous density associated with the distribution
$\Phi$. The customer parameters $(w_k, e_k, s_k, l_k)$ of each of the customers of $N$ are
drawn randomly from the density $\phi$. Associated with each customer is a job
whose parameters are perturbed by $\Delta$ in the third and fourth coordinates, that is,
$(w_k, e_k, s_k + \Delta, l_k + \Delta)$. This is equivalent to randomly drawing the job parameters
from a density which we call $\phi'$. The density $\phi'$ can be found simply by translating $\phi$
by $\Delta$ in the third and fourth coordinates, that is, for each $x = (\theta_1,\theta_2,\theta_3,\theta_4) \in \mathbb{R}^4$,
$\phi'(x) = \phi'(\theta_1,\theta_2,\theta_3,\theta_4) = \phi(\theta_1,\theta_2,\theta_3 - \Delta,\theta_4 - \Delta)$. Finally, for each $x \in \mathbb{R}^4$, define
$\psi(x) = \min\{\phi(x), \phi'(x)\}$ and let $q = \int_{\mathbb{R}^4}\psi \le 1$.

The $n$ jobs (or customer parameters) $\{y_k = (w_k, e_k, s_k + \Delta, l_k + \Delta) : k \in N\}$
are drawn randomly from the density $\phi'$ and our task is to select the set $T \subseteq N$.
To simplify presentation, we refer interchangeably to the index set of jobs and to
the set of jobs itself; that is, $k \in N$ will have the same interpretation as $y_k \in N$
where $y_k = (w_k, e_k, s_k + \Delta, l_k + \Delta)$.

For each job $y_k$, generate a random value, call it $u_k$, uniformly in $[0, \phi'(y_k)]$. The
point $(y_k, u_k) \in \mathbb{R}^5$ is a point below the graph of $\phi'$; that is, $u_k \le \phi'(y_k)$. Define
$T$ as the set of indices of jobs whose $u_k$ value falls below the graph of $\phi$; that is,
$T = \{k \in N : u_k \le \phi(y_k)\}$. Then the set of jobs $\{y_k : k \in T\}$ can be viewed as a
random sample of $|T|$ jobs drawn randomly from the density $\psi/q$.
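
This selection rule is a rejection-sampling step and can be sketched in a few lines. The code below is an illustration only: the densities are passed in as callables `phi` and `phi_shifted` (standing for $\phi$ and $\phi'$), the name `select_T` is hypothetical, and the constant densities in the usage example are toy stand-ins rather than densities on $C$.

```python
import random


def select_T(jobs, phi, phi_shifted, rng=None):
    """Return the indices k kept in T.

    For each job y_k, draw u_k uniformly in [0, phi_shifted(y_k)] and keep k
    when u_k <= phi(y_k), i.e. when (y_k, u_k) lies below the graph of phi.
    """
    if rng is None:
        rng = random.Random()
    T = []
    for k, y in enumerate(jobs):
        u = rng.uniform(0.0, phi_shifted(y))
        if u <= phi(y):
            T.append(k)
    return T


# Toy usage: phi is half of phi_shifted everywhere, so about half the jobs are kept.
rng = random.Random(0)
jobs = list(range(10000))
kept = select_T(jobs, lambda y: 0.5, lambda y: 1.0, rng)
print(len(kept) / len(jobs))
```

A job is rejected with probability $\max\{(\phi'(y_k) - \phi(y_k))/\phi'(y_k), 0\}$, so when $\phi$ and $\phi'$ are close almost every job is kept.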

In order to relate this upper bound to the lower bound we need to present the 
following lemma. 



Lemma 15.3.3 For $T$ generated as above and for each subregion $A_i$, $i = 1,2,\dots,t(\Delta)$,

$$\lim_{n\to\infty}\frac{M_\Delta^*(T(i))}{n(i)} \le \gamma \quad (a.s.).$$

Proof. To prove the result for a given subregion $A_i$, we construct a feasible schedule
for the set of jobs $\{y_k = (w_k, e_k, s_k + \Delta, l_k + \Delta) : k \in T(i)\}$. Generate $n(i) - |T(i)|$
jobs randomly from the density

$$\frac{\phi - \psi}{1 - q}.$$

Call this set of jobs $D$, for dummy jobs. From the construction of the sets $D$ and
$T(i)$, it is a simple exercise to show that the parameters of the jobs in $D \cup T(i)$
are distributed like $\phi$.

A feasible schedule of the jobs in $T(i)$ is obtained by optimally scheduling the
jobs in $D \cup T(i)$ using, say, $M_i^*$ machines. The number of machines needed to
schedule the jobs in $T(i)$ is obviously no more than $M_i^*$, since the jobs in $D$ can
simply be ignored. Thus we have the bound

$$M_\Delta^*(T(i)) \le M_i^*.$$

Now dividing by $n(i)$ and taking the limits, we get

$$\lim_{n\to\infty}\frac{M_\Delta^*(T(i))}{n(i)} \le \lim_{n\to\infty}\frac{M_i^*}{n(i)} = \gamma \quad (a.s.),$$

since the set of jobs $D \cup T(i)$ is just a set of $n(i)$ jobs whose parameters are drawn
independently from the density $\phi$. ∎

Lemma 15.3.3 thus reduces equation (15.1) to

$$\begin{aligned}
\lim_{n\to\infty}\frac{1}{n}Z_t^* &\le 2\sum_{i=1}^{t(\Delta)}\gamma\lim_{n\to\infty}\frac{n(i)}{n}\,d(i) + 2\Delta + \lim_{n\to\infty}\frac{1}{n}\sum_{k\notin T} 2d_k\\
&= 2\gamma\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{t(\Delta)} n(i)\,d(i) + 2\Delta + \lim_{n\to\infty}\frac{1}{n}\sum_{k\notin T} 2d_k\\
&\le 2\gamma\lim_{n\to\infty}\frac{1}{n}\sum_{k\in N} d_k + 2\Delta + \lim_{n\to\infty}\frac{1}{n}\sum_{k\notin T} 2d_k\\
&= 2\gamma E(d) + 2\Delta + \lim_{n\to\infty}\frac{1}{n}\sum_{k\notin T} 2d_k\\
&\le 2\gamma E(d) + 2\Delta + 2d_{\max}\lim_{n\to\infty}\frac{1}{n}|N \setminus T|.
\end{aligned}$$

The next lemma determines an upper bound on $\lim_{n\to\infty}\frac{1}{n}|N \setminus T|$.
Lemma 15.3.4 Given $\epsilon > 0$ and $T$ generated as above,

$$\lim_{n\to\infty}\frac{1}{n}|N \setminus T| \le (1 + \epsilon)^2\epsilon \quad (a.s.).$$



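Proof. By the Strong Law of Large Numbers, the limit is equal to the probability
that a job of $N$ is not in the set $T$. The probability of a particular job $y_k$ not being
in $T$ is simply

$$\begin{cases} \dfrac{\phi'(y_k) - \phi(y_k)}{\phi'(y_k)}, & \text{if } \phi'(y_k) \ge \phi(y_k),\\ 0, & \text{otherwise.} \end{cases}$$

Hence, almost surely

$$\begin{aligned}
\lim_{n\to\infty}\frac{1}{n}|N \setminus T| &= \int_{\mathbb{R}^4}\max\Big\{\frac{\phi'(x) - \phi(x)}{\phi'(x)}, 0\Big\}\,\phi'(x)\,dx\\
&\le \int_{\mathbb{R}^4}\Big|\frac{\phi'(x) - \phi(x)}{\phi'(x)}\Big|\,\phi'(x)\,dx\\
&= \int_{\mathbb{R}^4}|\phi'(x) - \phi(x)|\,dx\\
&= \int_{\mathbb{R}^4}|\phi'(\theta_1,\theta_2,\theta_3,\theta_4) - \phi(\theta_1,\theta_2,\theta_3,\theta_4)|\,d(\theta_1,\theta_2,\theta_3,\theta_4)\\
&= \int_{\mathbb{R}^4}|\phi(\theta_1,\theta_2,\theta_3 - \Delta,\theta_4 - \Delta) - \phi(\theta_1,\theta_2,\theta_3,\theta_4)|\,d(\theta_1,\theta_2,\theta_3,\theta_4)\\
&\le (1 + \Delta)^2\epsilon\\
&\le (1 + \epsilon)^2\epsilon,
\end{aligned}$$

where the second to last inequality follows from
$\|(\theta_1,\theta_2,\theta_3 - \Delta,\theta_4 - \Delta) - (\theta_1,\theta_2,\theta_3,\theta_4)\| = \Delta\sqrt{2} < \delta$ and the continuity of $\phi$. ∎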

We now have all the necessary ingredients to finish the proof of Theorem 15.2.1;
thus

$$\lim_{n\to\infty}\frac{1}{n}Z_t^* \le 2\gamma E(d) + 2d_{\max}(1 + \epsilon)^2\epsilon + 2\Delta \quad (a.s.).$$

Since $\epsilon$ was arbitrary and recalling that $\Delta < \epsilon$, we have

$$\lim_{n\to\infty}\frac{1}{n}Z_t^* \le 2\gamma E(d) \quad (a.s.).$$

This upper bound combined with the lower bound proves Theorem 15.2.1.



15.4 An Asymptotically Optimal Heuristic

In this section we generalize the LBH developed for the CVRP (see
Chapter 14) to handle time window constraints. As with the original LBH,
we prove that the generalized version is asymptotically optimal for the VRPTW.
We refer to this more general version of the heuristic also as the Location-Based
Heuristic; this should cause no confusion.



15.4.1 The Location-Based Heuristic

The LBH can be viewed as a three-step algorithm. In the first step, the parameters 
of the VRPTW are transformed into data for a location problem called the Ca- 
pacitated Vehicle Location Problem with Time Windows (CVLPTW), described 
below. This location problem is solved in the second step. In the final step, we 
transform the solution to the CVLPTW into a feasible solution to the VRPTW. 

The Capacitated Vehicle Location Problem with Time Windows 

The Capacitated Vehicle Location Problem with Time Windows (CVLPTW) is
a generalization of the single-source Capacitated Facility Location Problem (see
Section 14.7) and can be described as follows: we are given $m$ possible sites to
locate vehicles of capacity $Q$. There are $n$ customers geographically dispersed in
a given region, where customer $i$ has $w_i$ units of product that must be picked up
by a vehicle. The pickup of customer $i$ takes $s_i$ units of time and must occur in
the time window between times $e_i$ and $l_i$; that is, the service of customer $i$ can
start at any time $t \in [e_i, l_i - s_i]$. The objective is to select a subset of the possible
sites, to locate one vehicle at each site, and to assign the customers to the vehicles.
Each vehicle must leave its site, pick up the load of customers assigned to it in
such a way that the vehicle capacity is not exceeded and all pickups occur within
the customer’s time window, and then return to its site. The costs are as follows:
a site-dependent cost is incurred for locating each vehicle; that is, if a vehicle is
located at site $j$, the set-up cost is $v_j$, for $j = 1,2,\dots,m$. The cost of assigning
customer $i$ to the vehicle at site $j$ is $c_{ij}$ (the assignment cost), for $i = 1,2,\dots,n$
and $j = 1,2,\dots,m$. We assume that there are enough vehicles and sites so that a
feasible solution exists.

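The CVLPTW can be formulated as the following mathematical program. Let

$$y_j = \begin{cases} 1, & \text{if a vehicle is located at site } j,\\ 0, & \text{otherwise,} \end{cases}$$

and let

$$x_{ij} = \begin{cases} 1, & \text{if customer } i \text{ is assigned to the vehicle at site } j,\\ 0, & \text{otherwise.} \end{cases}$$

For any set $S \subseteq N$, let $f_j(S) = 1$ if the set of customers $S$ can be feasibly served
in their time windows by one vehicle that starts and ends at site $j$ (disregarding
the capacity constraint), and 0 otherwise.

$$\begin{aligned}
\text{Problem P:}\quad \text{Min}\quad & \sum_{i=1}^{n}\sum_{j=1}^{m} c_{ij}x_{ij} + \sum_{j=1}^{m} v_j y_j \\
\text{s.t.}\quad & \sum_{j=1}^{m} x_{ij} = 1, && \forall i && (15.2)\\
& \sum_{i=1}^{n} w_i x_{ij} \le Q, && \forall j && (15.3)\\
& x_{ij} \le y_j, && \forall i,j && (15.4)\\
& f_j(\{i : x_{ij} = 1\}) = 1, && \forall j && (15.5)\\
& x_{ij}, y_j \in \{0,1\}, && \forall i,j. && (15.6)
\end{aligned}$$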



Constraints (15.2) ensure that each customer is assigned to exactly one vehicle, 
and constraints (15.3) ensure that the vehicle’s capacity constraint is not violated. 
Constraints (15.4) guarantee that if a customer is assigned to the vehicle at site 
j, then a vehicle is located at that site. Constraints (15.5) ensure that the time 
window constraints are not violated. Constraints (15.6) ensure the integrality of 
the variables. 
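
The time window condition in constraints (15.5) can be evaluated by enumeration for a small set of customers. The sketch below rests on assumptions that the text does not spell out: travel time equals Euclidean distance (unit speed), a vehicle may wait at a customer, and, as in the model of Section 15.2, the vehicle may leave and return at any time. Under these assumptions the starting site does not affect feasibility, so the hypothetical function `time_feasible` takes only the customers.

```python
import math
from itertools import permutations


def time_feasible(customers):
    """Return 1 if one vehicle can serve all customers in their time windows.

    customers: list of (location, e, s, l), where service of length s must
    start no earlier than e and end no later than l. Capacity is ignored,
    as in the definition of f_j(S). All visiting orders are tried.
    """
    for order in permutations(customers):
        t, pos, ok = -math.inf, None, True  # vehicle may depart at any time
        for loc, e, s, l in order:
            arrive = t if pos is None else t + math.dist(pos, loc)
            start = max(arrive, e)  # wait if the vehicle arrives early
            if start + s > l + 1e-12:
                ok = False
                break
            t, pos = start + s, loc
        if ok:
            return 1
    return 0


# Hypothetical customers 0.5 apart; feasibility hinges on the second deadline.
a = ((0.0, 0.0), 0.0, 0.1, 0.1)
print(time_feasible([a, ((0.5, 0.0), 0.0, 0.1, 0.7)]))
print(time_feasible([a, ((0.5, 0.0), 0.0, 0.1, 0.6)]))
```

The enumeration is exponential in the number of customers, so it is meant only to make the definition concrete for tiny sets.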

The Heuristic 

To relate the CVLPTW to the VRPTW, consider each customer in the VRPTW 
to be a potential site for a vehicle; that is, the set of potential sites is exactly the set 
of customers, and therefore $m = n$. Picking a subset of the sites in the CVLPTW
corresponds to picking a subset of the customers in the VRPTW; we call this set 
of selected customers the seed customers. These customers are those that will form 
simple tours with the depot. 

In order for the LBH to perform well, the costs of the CVLPTW should approximate
the costs of the VRPTW. The set-up cost for locating a vehicle at site $j$ ($v_j$)
or, in other words, of picking customer $j$ as a seed customer, should be the cost
of sending a vehicle from the depot to customer $j$ and back (i.e., the length of the
simple tour). Hence, we set $v_j = 2d_j$ for each $j \in N$. The assignment cost $c_{ij}$ is
the cost of assigning customer $i$ to the vehicle at site $j$. Therefore, this cost should
represent the added cost of inserting customer $i$ into the simple tour through the
depot and customer $j$. Consequently, when $i$ is added to a tour with $j$, the added
cost is $c_{ij} = d_i + d_{ij} - d_j$, so that $v_j + c_{ij} = d_i + d_{ij} + d_j$. This cost is exact for two
and sometimes three customers. However, as the number of customers increases,
the values of $c_{ij}$ in fact represent an approximation to the cost of adding $i$ to a
tour that goes through customer $j$ and the depot. In Section 15.4.3 we present
values of $c_{ij}$ that we have found to work well in practice.
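
The cost definitions above can be checked numerically. The following is a minimal sketch, assuming Euclidean distances and a depot at the origin; the function names `dist` and `seed_costs` and the sample coordinates are illustrative assumptions, not part of the original text. It computes $v_j = 2d_j$ and $c_{ij} = d_i + d_{ij} - d_j$ and confirms that $v_j + c_{ij}$ equals the length of the tour depot, $j$, $i$, depot when only two customers are involved.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def seed_costs(depot, customers):
    """Return (v, c) with set-up costs v[j] = 2*d_j and nearest insertion
    assignment costs c[i][j] = d_i + d_ij - d_j."""
    d = [dist(depot, p) for p in customers]
    n = len(customers)
    v = [2.0 * d[j] for j in range(n)]
    c = [[d[i] + dist(customers[i], customers[j]) - d[j] for j in range(n)]
         for i in range(n)]
    return v, c

# Hypothetical data: two customers on a ray from the depot.
depot = (0.0, 0.0)
customers = [(3.0, 4.0), (6.0, 8.0)]
v, c = seed_costs(depot, customers)

# Length of the tour depot -> customer 0 (seed) -> customer 1 -> depot.
tour = (dist(depot, customers[0])
        + dist(customers[0], customers[1])
        + dist(customers[1], depot))
```

Here `v[0] + c[1][0]` and `tour` agree, which is the two-customer exactness claimed in the text; with more customers per seed the sum of the $c_{ij}$ is only an approximation.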

Once these costs are determined, the second step of the LBH consists of solving
CVLPTW. The solution provided is a set of sites (seed customers) and a set of
customers assigned to each of these sites (to each seed). This solution can then be
easily transformed into a solution to the VRPTW, since a set of customers that 
can be feasibly served starting from site j can also be feasibly served starting from 
the depot. 

15.4.2 A Solution Method for CVLPTW

The computational efficiency of the LBH depends on the efficiency with which 
CVLPTW can be solved. We therefore present a method to solve the CVLPTW. 
As discussed earlier, the CVLPTW without constraints (15.5) is simply the single- 
source Capacitated Facility Location Problem (CFLP) for which efficient solution 
methods exist based on the celebrated Lagrangian relaxation technique; see Section 
5.3. For the CVLPTW, we use a similar method, although the specifics are more 
complex in view of the existence of these time window constraints. 

In this case, for a given multiplier vector $\lambda \in \mathbb{R}^n$, constraints (15.2) are relaxed
and put into the objective function with the multiplier vector. The resulting
problem can be separated into n subproblems (one for each of the n sites), since 
constraints (15.2) are the only constraints that relate the sites to one another. The 
subproblem for site j is: 



\begin{align*}
\text{Problem } P_j: \quad \text{Min} \quad & \sum_{i=1}^{n} \bar{c}_{ij} x_{ij} + v_j y_j \\
\text{s.t.} \quad & \sum_{i=1}^{n} w_i x_{ij} \le Q \\
& x_{ij} \le y_j, \quad \forall i \\
& f_j(\{i : x_{ij} = 1\}) = 1 \\
& x_{ij} \in \{0, 1\} \ \forall i \text{ and } y_j \in \{0, 1\},
\end{align*}

where $\bar{c}_{ij} = c_{ij} + \lambda_i$, for each $i \in N$.

In the optimal solution to Problem $P_j$, $y_j$ is either 0 or 1. If $y_j = 0$, then $x_{ij} = 0$
for all $i \in N$, and the objective function value is 0. If $y_j = 1$, then the problem
reduces to a different, but simpler, routing problem. Consider a vehicle of capacity
$Q$ initially located at site $j$. The driver gets a profit of $p_{ij} = -\bar{c}_{ij}$ for picking up
the $w_i$ items at customer $i$ in the time window $(e_i, l_i)$. The pickup operation takes
$s_i$ units of time. The objective is to choose a subset of the customers, to pick up
their loads in their time windows, without violating the capacity constraint, using
a vehicle which must begin and end at site $j$, while maximizing the driver’s profit.
Let $G_j^*$ be the maximum profit attainable at site $j$; that is, $G_j^*$ is the optimal
solution to the problem just described for site $j$. This implies that $v_j - G_j^*$ is the
optimal solution value of Problem $P_j$ given that $y_j = 1$. Therefore, we can write
the optimal solution to Problem $P_j$ as simply $\min\{0, v_j - G_j^*\}$.

Unfortunately, in general, determining the values $G_j^*$ for $j \in N$ is NP-Hard.
We can, however, determine upper bounds on $G_j^*$; call them $G_j$. This provides
a lower bound on the optimal solution to Problem $P_j$ which is equal to
$\min\{0, v_j - G_j\}$. We use the simple bound given by $G_j = \sum_{\{i : p_{ij} > 0\}} p_{ij}$. Consequently,
$\sum_{j=1}^{n} \min\{0, v_j - G_j\} - \sum_{i=1}^{n} \lambda_i$ is a lower bound on the optimal solution
to the CVLPTW.
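
As a hedged illustration of this bound, the sketch below evaluates $G_j$ as the sum of the positive profits $p_{ij} = -(c_{ij} + \lambda_i)$ and then the lower bound $\sum_j \min\{0, v_j - G_j\} - \sum_i \lambda_i$. The function name `lagrangian_lower_bound` and the small data set are assumptions made only for this example.

```python
def lagrangian_lower_bound(v, c, lam):
    """Lower bound on the CVLPTW for a given multiplier vector.

    v[j]   : set-up cost of site j
    c[i][j]: assignment cost of customer i to site j
    lam[i] : multiplier of the relaxed constraint (15.2) for customer i
    Returns (bound, G) where G[j] is the profit upper bound at site j.
    """
    n_customers = len(c)
    G = []
    total = 0.0
    for j in range(len(v)):
        # Profit p_ij = -(c_ij + lam_i); only positive profits are summed.
        g = sum(max(0.0, -(c[i][j] + lam[i])) for i in range(n_customers))
        G.append(g)
        total += min(0.0, v[j] - g)
    return total - sum(lam), G

# Hypothetical instance with two customers that are also the two sites.
v = [10.0, 20.0]
c = [[0.0, 10.0],
     [10.0, 0.0]]
lam = [-8.0, -12.0]
bound, G = lagrangian_lower_bound(v, c, lam)
```

For this instance the bound is 20, which equals the cost of locating one vehicle at site 0 and assigning both customers to it ($v_0 + c_{00} + c_{10} = 20$), so these particular multipliers happen to give a tight bound.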

To generate a feasible solution to the VRPTW at each iteration of the procedure, 
we use information from the upper bounds on profit $G_j$ for $j \in N$. After every
iteration of the lower bound (for each multiplier) we renumber the sites so that
$G_1 \ge G_2 \ge \cdots \ge G_n$. The upper bounds on profit are used as an estimate of
the profitability of placing a vehicle at a particular site. For example, site 1 is 
considered to be a “good” site (or seed customer), since a large profit is possible 
there. A large profit for site j corresponds to a seed customer where neighboring 
customers can be feasibly served from it at low cost. Therefore, a site with large 
profit is selected as a seed customer since it will tend to have neighboring customers 
around it that can be feasibly served by a vehicle starting at that site. 

To generate a feasible solution to CVLPTW, we do the following: starting with 
j = 1 in the new ordering of the sites (customers), we locate a vehicle at site j. 
For every customer still not assigned to a site, we first determine if this customer 
can be feasibly served with the customers that are currently assigned to site j. 
Then, of the customers that can be served from this site, we determine the one 
that will cause the least increase in cost, that is, the one with minimum $c_{ij}$ over
all customers i that can be served from this site. We then assign this customer 
to the site. We continue until no more customers can be assigned to site j, due 
to capacity or time constraints. We then increment j to 2 and continue with site 
2. After all customers have been feasibly assigned to a site, we obtain a feasible 
solution whose cost is compared to the cost of the current best solution. 
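
The greedy construction just described can be sketched as follows. This is only an illustration under stated assumptions: the time window test $f_j$ is represented by a hypothetical callback `feasible` that accepts everything by default, so only the capacity constraint is enforced, and the name `greedy_assignment` and the data are invented for the example.

```python
def greedy_assignment(site_order, c, w, Q, feasible=lambda j, members: True):
    """Open sites in the given order; at each site repeatedly add the
    unassigned customer with minimum c[i][j] that still fits.

    feasible(j, members) stands in for the time window check f_j of the
    text (hypothetical callback; the default accepts every set).
    Returns (solution, unassigned) with solution[j] = customers at site j.
    """
    unassigned = set(range(len(w)))
    solution = {}
    for j in site_order:
        if not unassigned:
            break
        members, load = [], 0.0
        while True:
            candidates = [i for i in sorted(unassigned)
                          if load + w[i] <= Q and feasible(j, members + [i])]
            if not candidates:
                break  # capacity or time constraints block further additions
            best = min(candidates, key=lambda i: c[i][j])
            members.append(best)
            load += w[best]
            unassigned.remove(best)
        if members:
            solution[j] = members
    return solution, unassigned

# Hypothetical data: four customers, sites already sorted by profit bound.
w = [4, 3, 5, 2]
c = [[0, 2, 6, 7],
     [2, 0, 5, 6],
     [6, 5, 0, 1],
     [7, 6, 1, 0]]
solution, unassigned = greedy_assignment([0, 2, 1, 3], c, w, Q=8)
```

With these numbers site 0 receives customers 0 and 1, site 2 receives customers 2 and 3, and no customer is left unassigned.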

As we find solutions to the CVLPTW, we also generate feasible solutions to the 
VRPTW, using the information from the lower bound to CVLPTW. Starting with 
j = 1, pick customer j as a seed customer. Then, for every customer that can be 
feasibly served with this seed, we determine the added distance this would entail; 
that is, we determine the best place to insert the customer into the current tour 
through the customers assigned to seed j. We choose the customer that causes the 
least increase in distance traveled as the one to assign to seed j. This idea is similar 
to the Nearest Insertion heuristic discussed in Section 3.3.2. We then continue 
trying to add customers in this way to seed j. Once no more can be added to this 
tour (due to capacity or time constraints), we increment j to 2, select seed customer 
2 and continue. Once every customer appears in a tour, that is, every customer is 
assigned to a seed, we have a feasible solution to the VRPTW corresponding to 
the current set of multipliers. The cost of this solution is compared to the cost of 
the current best solution. 

Multipliers are updated using (5.6). The step size is initially set to 2 and halved 
after the lower bound has not improved in a series of 30 iterations. After the step 
size has reached a preset minimum (0.05), the heuristic is terminated.

15.4.3 Implementation

It is clear that many possible variations of the LBH can be implemented depending 
on the type of assignment costs ($c_{ij}$) used. In the computational results discussed
below, the following have been implemented.

direct cost: $c_{ij} = 2d_{ij}$, and

nearest insertion cost: $c_{ij} = d_i + d_{ij} - d_j$.

The direct cost $c_{ij}$ has the advantage that, when several customers are added to the
seed, the resulting cost, which is the sum of the set-up costs and these direct costs, 
is an upper bound on the length of any efficient route through the customers. On 
the other hand, the nearest insertion cost works well because it is accurate at least 
for tours through two customers, and often for tours through three customers as 
well. 

Several versions of the LBH have been implemented and tested. In the first, the 
Star-Tours (ST) heuristic, the direct assignment cost is used, while in the second, 
the Seed-Insertion (SI) heuristic, the nearest insertion assignment cost is applied. 
Observe that the LBH is not a polynomial-time heuristic. However, as we shall 
shortly demonstrate, the running times reported on standard test problems are 
very reasonable and are comparable to the running times of many heuristics for 
the vehicle routing problem. 

The ST heuristic is of particular interest because it is asymptotically optimal as 
demonstrated in the following lemma. The proof is similar to the previous proofs 
and is therefore omitted. 

Lemma 15.4.1 Let $n$ customers, indexed by $N$, be independently and identically
distributed according to a distribution $\mu$ with compact support in $\mathbb{R}^2$. Define

\[ E(d) = \int_{\mathbb{R}^2} \|x\| \, d\mu(x). \]

Let the customer parameters $\{(w_k, e_k, s_k, l_k) : k \in N\}$ be jointly distributed like
$\Phi$. In addition, let $M^*$ be the minimum number of machines needed to feasibly
schedule the jobs $\{(w_k, e_k, s_k, l_k) : k \in N\}$ and let $\lim_{n \to \infty} M^*/n = \gamma$ (a.s.).
Then

\[ \lim_{n \to \infty} \frac{1}{n} Z^{ST} = \lim_{n \to \infty} \frac{1}{n} Z_t^* = 2\gamma E(d) \quad (a.s.). \]



15.4.4 Numerical Study

Tables 1 and 2 summarize the computational experiments with the standard test 
problems of Solomon (1986). The problem set consists of 56 problems of various 
types. All problems consist of 100 customers and one depot, and the distances 
are Euclidean. Problems with the “R” prefix are problems where the customer 
locations are randomly generated according to a uniform distribution. Problems
with the “C” prefix are problems where the customer locations are clustered.
Problems with the “RC” prefix are a mixture of both random and clustered. In addition, all the
problems have a constraint on the latest time $T_0$ at which a vehicle can return to
the depot. For a full description of these problems we refer the reader to Solomon.

Table 1

Problem    Alg. ST   CPU Time (s)    Alg. SI   CPU Time (s)   Solomon’s Best Solution
C201         591.6          245.9      591.6          260.5       591
C202        *652.8          276.1     *640.8          262.7       731
C203        *692.2          309.2     *741.1          308.9       786
C204        *721.6          335.9      782.3          340.6       758
C205         713.8          250.8      699.9          258.8       606
C206         770.8          257.3     *722.8          283.3       730
C207         767.2          265.7      708.9          275.8       680
C208         736.2          287.7      660.2          272.4       607
R201       *1665.3          207.1    *1533.4          209.6      1741
R202       *1485.3          276.4    *1484.3          248.5      1730
R203       *1371.5          406.5    *1349.3          389.0      1567
R204        1096.7          532.0     1077.0          538.2      1059
R205        1472.3          287.0    *1329.4          312.6      1471
R206       *1237.0          412.2    *1283.7          374.2      1405
R207       *1217.7          484.8    *1162.9          453.9      1241
R208        *966.1          587.8     *959.9          612.6      1046
R209       *1276.1          394.8    *1262.8          355.7      1418
R210       *1312.5          380.7    *1340.6          388.6      1425
R211        1080.9          474.7     1141.3          488.7      1016
RC201      *1873.8          203.5    *1841.7          185.8      1880
RC202      *1742.1          227.8    *1705.1          241.0      1799
RC203      *1417.5          331.5    *1471.1          300.1      1550
RC204      *1139.6          437.7    *1190.3          411.5      1208
RC205      *1830.5          233.0    *1878.9          214.0      2080
RC206       1640.1          259.0     1607.5          248.2      1582
RC207      *1566.4          294.2    *1557.3          272.3      1632
RC208       1254.8          345.7     1298.7          317.3      1194

(* indicates that the LBH improves upon the best solution known.)

We compare the performance of the LBH against the heuristics of Solomon and 
the column generation approach of Desrochers et al. (1992). The latter method 
was able to solve effectively 7 of the 56 test problems; we describe this approach 
in the next chapter.

To compare the LBH to these solution methods, a time window reduction phase
was implemented before the start of the heuristic. Here, the earliest time for service
$e_k$ is replaced by $\max\{e_k, d_k\}$; in that way, vehicles leave the depot no earlier than
time 0. In addition, the latest time service can end $l_k$ is replaced by $\min\{l_k, T_0 - d_k\}$.
The LBH can then be run as it is described in Section 15.4.1.

Table 2

Problem    Alg. ST   CPU Time (s)    Alg. SI   CPU Time (s)   Solomon’s Best Solution   DDS Solution Value
C101         828.9           74.1      828.9           67.0       829                     827.3
C102         982.8           82.9     1043.4           73.1       968                     827.3
C103       *1015.1           95.9     1232.9           88.4      1026                         -
C104        *980.9          105.4     *976.1          114.5      1053                         -
C105        *828.9           79.7      860.8           67.3       829                         -
C106         852.9           82.8      880.1           66.7       834                     827.3
C107         828.9           83.1      841.2           74.7       829                     827.3
C108         852.9           88.6      853.6           80.9       829                     827.3
C109         991.0           88.6     1014.5           83.1       829                         -
R101        1983.7           57.2     2071.2           39.9      1873                    1607.7
R102        1789.0           70.8     1821.4           57.4      1843                    1434.0
R103        1594.5           88.6     1599.1           67.9      1484                         -
R104        1242.0          106.2     1237.3           81.0      1188                         -
R105        1604.4           67.0     1696.2           52.0      1502                         -
R106        1606.9           78.0     1589.2           70.0      1460                         -
R107       *1324.9           92.4     1361.2           70.4      1353                         -
R108        1202.6          107.5     1205.5          101.1      1134                         -
R109        1504.7           78.5     1491.8           69.6      1412                         -
R110        1380.9           92.0     1434.4           69.4      1211                         -
R111        1422.1           91.7     1432.4           69.5      1202                         -
R112        1248.1          105.2     1284.6           79.4      1086                         -
RC101       2045.1           60.6     2014.4           45.0      1867                         -
RC102       1806.6           68.7     1969.5           52.2      1760                         -
RC103       1708.9           81.7     1716.3           69.6      1641                         -
RC104       1372.1           93.5     1458.8           79.5      1301                         -
RC105      *1826.3           68.9     2036.8           51.3      1922                         -
RC106       1710.8           68.0     1804.8           50.5      1611                         -
RC107       1593.2           76.4     1630.9           64.9      1385                         -
RC108       1421.0           84.7     1493.8           65.5      1253                         -

(* indicates that the LBH improves upon the best solution known.)

As can be seen in the tables, both the ST and the SI heuristics have been
implemented. CPU times are in seconds on a Sun SPARC Station II. In Tables
1 and 2, the column “Solomon’s Best Solution” corresponds to the best solution
found by Solomon. Solomon tested eight different heuristics on problem sets R1
and C1, and six heuristics on problems RC1, R2, C2 and RC2. We see that the
ST heuristic provides a better solution than Solomon’s heuristics in 25 of the 
56 problems, while the SI heuristic provides a better solution in 21 of the 56 
problems. In Table 2, the column “DDS Solution Value” corresponds to the value 
of the solution found using the column generation approach of Desrochers et al. 
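
The time window reduction phase mentioned in this section is a simple preprocessing step. The sketch below is one possible reading of it, assuming unit speed (so that the distance $d_k$ from the depot is also a travel time); the function name `reduce_time_windows` and the numbers are hypothetical.

```python
def reduce_time_windows(e, l, d, T0):
    """Tighten each window (e_k, l_k): service cannot start before a
    vehicle leaving the depot at time 0 arrives (d_k), and must end early
    enough to return to the depot by T0."""
    new_e = [max(ek, dk) for ek, dk in zip(e, d)]
    new_l = [min(lk, T0 - dk) for lk, dk in zip(l, d)]
    return new_e, new_l

# Hypothetical data for two customers.
e, l = [0, 20], [100, 230]
d = [10, 15]
new_e, new_l = reduce_time_windows(e, l, d, T0=240)
```

The first customer's earliest time moves from 0 to 10, and the second customer's latest time moves from 230 to 225; the other two values are unchanged.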



15.5 Exercises 



Exercise 15.1. You are given a network $G = (V, A)$ where $|V| = n$, $d(i, j)$ is the
length of edge $(i, j)$ and a specified vertex $a \in V$. One service unit is located at
$a$ and has to visit each vertex in $V$ so that total waiting time of all vertices is
as small as possible. Assume the waiting time of a vertex is proportional to the
total distance traveled by the server from $a$ to the vertex. The total waiting time
(summed up over all customers) is then:

\[ (n - 1)d(a, 2) + (n - 2)d(2, 3) + (n - 3)d(3, 4) + \cdots + d(n - 1, n). \]

The Delivery Man Problem (DMP) is the problem of determining the tour that
minimizes the total waiting time.

Assume that $G$ is a tree with $d(i, j) = 1$ for every $(i, j) \in A$. Show that any tour
that follows a depth-first search starting from $a$ is optimal.

Exercise 15.2. Consider the Delivery Man Problem described in Exercise 15.1. 
A delivery man currently located at the depot must visit each of n customers. Let 
$Z^{DM}$ be the total waiting time in the optimal delivery man tour through the $n$
points. Let Z* be the total time required to travel the optimal traveling salesman 
tour through the n points. 

(a) Prove that 

Z DM < 

(b) One heuristic proposed for this problem is the Nearest Neighbor (NN) Heuristic.
In this heuristic, the vehicle serves the closest unvisited customer next. Provide
a family of examples to show that the heuristic does not have a fixed worst-case 
bound. 

Exercise 15.3. Consider the Vehicle Routing Problem with Distance Constraints. 
Formally, a set of customers has to be served by vehicles that are all located at 
a common depot. The customers and the depot are presented as the nodes of 
an undirected graph $G = (N, E)$. Each customer has to be visited by a vehicle.
The $j$th vehicle starts from the depot and returns to the depot after visiting a
subset $N_j \subseteq N$. The total distance traveled by the $j$th vehicle is denoted by
$T_j$. Each vehicle has a distance constraint $\lambda$: no vehicle can travel more than
$\lambda$ units of distance (i.e., $T_j \le \lambda$). We assume that the distance matrix satisfies
the triangle inequality assumption. Also, assume that the length of the optimal
traveling salesman tour through all the customers and the depot is greater than
$\lambda$.

(a) Suppose the objective function is to minimize the total distance traveled.
Let $K^*$ be the number of vehicles in an optimal solution to this problem. Show
that there always exists an optimal solution with total distance traveled $> \frac{1}{2} K^* \lambda$.
Does this lower bound hold for any optimal solution?

(b) Consider the following greedy heuristic: start with the optimal traveling
salesman tour through all the customers and the depot. In an arbitrary orientation
of this tour, the nodes are numbered $(i_0, i_1, \ldots, i_n) = S$ in order of appearance,
where $n$ = the number of customers, $i_0$ is the depot and $i_1, i_2, \ldots, i_n$ are the
customers. We break the tour into $K^H$ segments and connect the end-points of
each segment to the depot. This is done in the following way. Each vehicle $j$,
$1 \le j \le K^H$, starts by traveling from the depot to the first customer $i_q$ not visited
by the previous $j - 1$ vehicles and then visits the maximum number of customers
according to $S$ without violating the distance constraint upon returning to the
depot.

Show that $K^H \le \min\{n, \lceil T / (\lambda - 2d_m) \rceil\}$ where $T$ is the length of the optimal traveling
salesman tour and $d_m$ is the distance from the depot to the farthest customer.

Exercise 15.4. Consider the Pickup and Delivery Problem. Here customers are 
pickup customers with probability p and delivery customers with probability 1 —p. 
Assume a vehicle capacity of 1. If customer i is a pickup customer, then a load 
of size $w_i \le 1$ must be picked up at the customer and brought to the depot. If
customer $i$ is a delivery customer, then a load of size $w_i \le 1$ must be brought
from the depot to the customer. Assume pickup sizes are drawn randomly from a
distribution with bin-packing constant $\gamma_P$ and delivery sizes are drawn randomly
from a distribution with bin-packing constant $\gamma_D$. A pickup and a delivery can be
in the vehicle at the same time. 

(a) Develop a heuristic $H$ for this problem and determine $\lim_{n \to \infty} Z^H / n$ as a function
of $p$, $\gamma_P$ and $\gamma_D$.

(b) Assume all pickups are of size | and deliveries are of size |. Suggest a 
better heuristic for this case. What is $\lim_{n \to \infty} Z^H / n$ as a function of $p$ for this
heuristic?

Solving the VRP Using a Column
Generation Approach



16.1 Introduction 

A classical method, first suggested by Balinski and Quandt (1964), for solving 
the VRP with capacity and time window constraints is based on formulating the 
problem as a set-partitioning problem. (See Chapter 5 for a general discussion of 
set partitioning.) The idea is as follows: let the index set of all feasible routes be 
$\{1, 2, \ldots, R\}$ and let $c_r$ be the length of route $r$. Define

\[ a_{ir} = \begin{cases} 1, & \text{if customer } i \text{ is served in route } r, \\ 0, & \text{otherwise,} \end{cases} \]

for each customer $i = 1, 2, \ldots, n$ and each route $r = 1, 2, \ldots, R$. Also, for every
$r = 1, 2, \ldots, R$, let

\[ y_r = \begin{cases} 1, & \text{if route } r \text{ is in the optimal solution,} \\ 0, & \text{otherwise.} \end{cases} \]

In the Set-Partitioning formulation of the VRP, the objective is to select a minimum
cost set of feasible routes such that each customer is included in some route.
It is:

\begin{align}
\text{Problem } S: \quad \text{Min} \quad & \sum_{r=1}^{R} c_r y_r \notag \\
\text{s.t.} \quad & \sum_{r=1}^{R} a_{ir} y_r \ge 1, && \forall i = 1, 2, \ldots, n \tag{16.1} \\
& y_r \in \{0, 1\}, && \forall r = 1, 2, \ldots, R. \notag
\end{align}

Observe that we have written constraints (16.1) as inequality constraints instead
of equality constraints. The formulation with equality constraints is equivalent if we 
assume the distance matrix satisfies the triangle inequality and therefore each 
customer will be visited exactly once in the optimal solution. The formulation with 
inequality constraints will prove to be easier to work with from an implementation 
point of view. 

This formulation was first used successfully by Cullen et al. (1981) to design 
heuristic methods for the VRP. Recently, Desrochers et al. (1992) have used it in 
conjunction with a branch and bound method to generate optimal or near optimal 
solutions to the VRP. Similar methods have been used to solve crew scheduling
problems; see, for example, Hoffman and Padberg (1993).

Of course, the set of all feasible routes is extremely large and one cannot expect 
to generate it completely. Even if this set is given, it is not clear how to solve the set- 
partitioning problem since it is a large-scale integer program. Any method based 
on this formulation must overcome these two obstacles. We start, in Section 16.2, 
by showing how the linear relaxation of the set-partitioning problem can be solved 
to optimality without enumerating all possible routes. In Section 16.3, we combine 
this method with a polyhedral approach that generates an optimal or near-optimal 
solution to the VRP. Finally, in Section 16.4, we provide a probabilistic analysis 
that helps explain why a method of this type will be effective. 

To simplify the presentation, we assume no time window constraints exist; the
extension to the more general model is, for the most part, straightforward. The
interested reader can find some of these extensions in Desrochers et al. 



16.2 Solving a Relaxation of the Set-Partitioning 
Formulation 

To solve the linear relaxation of Problem S without enumerating all the routes, 
Desrochers et al. use the celebrated column generation technique. A thorough 
explanation of this method is given below, but the general idea is as follows. A 
portion of all possible routes is enumerated, and the resulting linear relaxation 
with this partial route set is solved. The solution to this linear program is then 
used to determine if there are any routes not included that can reduce the objective 
function value. This is the column generation step. Using the values of the optimal 
dual variables (with respect to the partial route set), a new route is generated 
and the linear relaxation is re-solved. This is continued until one can show that
an optimal solution to the linear program is found, one that is optimal for the 
complete route set. 

Specifically, this is done by enumerating a partial set of routes, $1, 2, \ldots, R'$, and
formulating the corresponding linear relaxation of the set-partitioning problem
with respect to this set:

\begin{align}
\text{Problem } S': \quad \text{Min} \quad & \sum_{r=1}^{R'} c_r y_r \notag \\
\text{s.t.} \quad & \sum_{r=1}^{R'} a_{ir} y_r \ge 1, && \forall i = 1, 2, \ldots, n \tag{16.2} \\
& y_r \ge 0, && \forall r = 1, 2, \ldots, R'. \notag
\end{align}

Let $\bar{y}$ be the optimal solution to Problem $S'$, and let $\bar{\pi}$ be the corresponding
optimal dual variables. We would like to know whether $\bar{y}$ (or equivalently, $\bar{\pi}$) is
optimal for the linear relaxation of Problem $S$ (respectively, the dual of the linear
relaxation of Problem $S$). To answer this question observe that the dual of the
linear relaxation of Problem $S$ is

\begin{align}
\text{Problem } S_D: \quad \text{Max} \quad & \sum_{i=1}^{n} \pi_i \notag \\
\text{s.t.} \quad & \sum_{i=1}^{n} a_{ir} \pi_i \le c_r, && \forall r = 1, 2, \ldots, R \tag{16.3} \\
& \pi_i \ge 0, && \forall i = 1, 2, \ldots, n. \notag
\end{align}

Clearly, if $\bar{\pi}$ satisfies every constraint (16.3) then it is optimal for Problem $S_D$
and therefore $\bar{y}$ is optimal for the linear programming relaxation of Problem $S$.
How can we check whether $\bar{\pi}$ satisfies every constraint in Problem $S_D$? Observe
that the vector $\bar{\pi}$ is not feasible in Problem $S_D$ if we can identify a single constraint,
$r$, such that

\[ \sum_{i=1}^{n} a_{ir} \bar{\pi}_i > c_r. \]

Consequently, if we can find a column $r$ minimizing the quantity $c_r - \sum_{i=1}^{n} a_{ir} \bar{\pi}_i$
and this quantity is negative, then a violated constraint is found. In that case the
current vector $\bar{\pi}$ is not optimal for Problem $S_D$. The corresponding column just
found can be added to the formulation of Problem $S'$, which is solved again. The
process repeats itself until no violated constraint (column) is found; in this case we
have found the optimal solution to the linear relaxation of Problem $S$ (the vector
$\bar{y}$) and the optimal solution to Problem $S_D$ (the vector $\bar{\pi}$).
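
To make the violated-constraint test concrete, here is a small sketch that scans an explicit list of candidate routes and returns the one minimizing $c_r - \sum_i a_{ir} \bar{\pi}_i$. Enumerating candidates explicitly is only for illustration (the point of the method is to avoid it), and the function name `most_violated_column` and the data are assumptions.

```python
def most_violated_column(columns, costs, pi):
    """Return (r, reduced_cost) for the column minimizing
    c_r - sum_i a_ir * pi_i, where columns[r] is the set of customers
    served by route r (so a_ir = 1 exactly when i is in columns[r])."""
    best_r, best_rc = None, float("inf")
    for r, (route, c_r) in enumerate(zip(columns, costs)):
        rc = c_r - sum(pi[i] for i in route)
        if rc < best_rc:
            best_r, best_rc = r, rc
    return best_r, best_rc

# Hypothetical example: the restricted problem holds only the two
# single-customer routes, whose optimal duals are pi = (10, 12).
columns = [{0}, {1}, {0, 1}]
costs = [10, 12, 14]
pi = [10, 12]
r, rc = most_violated_column(columns, costs, pi)
```

The combined route `{0, 1}` has reduced cost $14 - 22 = -8 < 0$, so its dual constraint is violated and it would be added to the restricted problem.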

Our task is then to find a column, or a route, $r$ minimizing the quantity:

\[ c_r - \sum_{i=1}^{n} a_{ir} \bar{\pi}_i. \tag{16.4} \]

We can look at this problem in a different way. Suppose we replace each distance
$d_{ij}$ with a new distance $d'_{ij}$ defined by

\[ d'_{ij} = d_{ij} - \frac{\bar{\pi}_i}{2} - \frac{\bar{\pi}_j}{2}. \]

Then a tour $u_1 \to u_2 \to \cdots \to u_\ell$ whose length using $\{d_{ij}\}$ is $\sum_{i=1}^{\ell-1} d_{u_i u_{i+1}} + d_{u_\ell u_1}$
has, using $\{d'_{ij}\}$, a length

\[ \sum_{i=1}^{\ell-1} d'_{u_i u_{i+1}} + d'_{u_\ell u_1} = \sum_{i=1}^{\ell-1} d_{u_i u_{i+1}} + d_{u_\ell u_1} - \sum_{i=1}^{\ell} \bar{\pi}_{u_i}. \]

Hence, finding a route $r$ that minimizes (16.4) is the same as finding a tour of minimum
length using the distance matrix $\{d'_{ij}\}$ that starts and ends at the depot,
visits a subset of the customers, and has a total load no more than $Q$. Unfortunately,
this itself is an NP-Hard problem and so we are left with a method that
is not attractive computationally.
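
The identity between tour length under the modified distances and the reduced cost (16.4) can be verified on a tiny instance. The sketch below assumes the modified distance takes the symmetric form $d'_{ij} = d_{ij} - \bar{\pi}_i/2 - \bar{\pi}_j/2$, treats node 0 as the depot and assigns it a dual value of 0; that depot convention, the function names and the data are assumptions made for the example.

```python
def modified_distances(d, pi):
    """d'[i][j] = d[i][j] - pi[i]/2 - pi[j]/2 for a symmetric matrix d.
    Node 0 is the depot and pi[0] is assumed to be 0."""
    n = len(d)
    return [[d[i][j] - pi[i] / 2 - pi[j] / 2 for j in range(n)]
            for i in range(n)]

def tour_length(dist, tour):
    """Length of the closed tour that visits `tour` in order and then
    returns to its first node."""
    return sum(dist[tour[k]][tour[(k + 1) % len(tour)]]
               for k in range(len(tour)))

# Hypothetical instance: depot 0 and customers 1, 2.
d = [[0, 5, 10],
     [5, 0, 5],
     [10, 5, 0]]
pi = [0, 10, 12]
dp = modified_distances(d, pi)
tour = [0, 1, 2]
original = tour_length(d, tour)   # c_r for this route
modified = tour_length(dp, tour)  # c_r minus the duals of visited nodes
```

Each visited node is entered once and left once, so it loses half of its dual value on each of its two tour edges; this is why `modified` equals `original - sum(pi[i] for i in tour)`.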

To overcome this difficulty, the set-partitioning formulation, Problem $S$, is modified
so as to allow routes visiting the same customer more than once. The purpose
of this modification will be clear in a moment. This model, call it Problem $S_M$
(where $M$ stands for the “modified” formulation), is defined as follows. Enumerate
all feasible routes, satisfying the capacity constraint, that may visit the same
customer a number of times; each such visit increases the total load by the demand
of that customer. Let the number of routes (columns) be $R_M$, and let $c_r$ be
the total distance traveled in route $r$. For each customer $i = 1, 2, \ldots, n$ and route
$r = 1, 2, \ldots, R_M$, let

\[ \xi_{ir} = \text{number of times customer } i \text{ is visited in route } r. \]

Also, for each $r = 1, 2, \ldots, R_M$, define

\[ y_r = \begin{cases} 1, & \text{if route } r \text{ is in the optimal solution,} \\ 0, & \text{otherwise.} \end{cases} \]

The VRP can be formulated as:

\begin{align}
\text{Problem } S_M: \quad \text{Min} \quad & \sum_{r=1}^{R_M} c_r y_r \notag \\
\text{s.t.} \quad & \sum_{r=1}^{R_M} \xi_{ir} y_r \ge 1, && \forall i = 1, 2, \ldots, n \tag{16.5} \\
& y_r \in \{0, 1\}, && \forall r = 1, 2, \ldots, R_M. \notag
\end{align}

This is the set-partitioning problem solved by Desrochers et al. and therefore it
is not exactly Problem $S$. Clearly, the optimal integer solution to Problem $S_M$
is the optimal solution to the VRP. However, the optimal solution values of the
linear relaxations of Problem $S_M$ and Problem $S$ may be different. Of course, the
linear relaxation of Problem $S_M$ provides a lower bound on the linear relaxation
of Problem $S$.

To solve the linear relaxation of Problem $S_M$ we use the method described
above (for solving Problem $S$). We enumerate a partial set of $R'_M$ routes; solve
Problem $S'_M$, which is the linear relaxation of Problem $S_M$ defined only on this
partial list; use the dual variables to see whether there exists a column not in the
current partial list with $\sum_{i=1}^{n} \xi_{ir} \bar{\pi}_i > c_r$. If there exists such a column(s), we add it
(them) to the formulation and solve the resulting linear program again. Otherwise,
we have the optimal solution to the linear relaxation of Problem $S_M$.

The modification we have made makes the column generation step computationally
easier. Such a column can now be found in pseudopolynomial time using dynamic
programming.

For this purpose, we need the following definitions. Given a path $P = \{0, u_1, u_2, \ldots, u_\ell\}$,
where it is possible that $u_i = u_j$ for $i \ne j$, let the load of this path be
$\sum_{i=1}^{\ell} w_{u_i}$. That is, the load of the path is the sum, over all customers in $P$, of the
demand of a customer multiplied by the number of times that customer appears
in $P$. Let $f_q(i)$ be the cost (using $\{d'_{ij}\}$) of the least cost path that starts at the
depot and terminates at vertex $i$ with total load $q$. This can be calculated using
the recursion

\[ f_q(i) = \min_{j \ne i} \left\{ f_{q - w_i}(j) + d'_{ij} \right\}, \tag{16.6} \]

with the initial conditions

\[ f_q(i) = \begin{cases} d'_{0i}, & \text{if } q = w_i, \\ +\infty, & \text{otherwise.} \end{cases} \]

Finally, let $f_q^0(i) = f_q(i) + d'_{0i}$. Thus, $f_q^0(i)$ is the length of a least cost tour that
starts at the depot, visits a subset of the customers, of which customer $i$ is the last
to be visited, has a total load $q$ and terminates at the depot. Observe that finding
$f_q^0(i)$ for every $q$, $1 \le q \le Q$, and every $i \in N$, requires $O(n^2 Q)$ calculations. The
recursion chooses the predecessor of $i$ to be a node $j \ne i$. This requires repeat visits
to the same customer to be separated by at least one visit to another customer. In
fact, expanding the state space of this recursion can eliminate two-loops: loops of
the type $\ldots i, j, i \ldots$. This forces repeat visits to the same customer to be separated
by visits to at least two other customers. This can lead to a stronger relaxation
of the set-partitioning model. For a more detailed discussion of this recursion, see
Christofides et al. (1981).

If there exists a $q$, $1 \le q \le Q$, and $i \in N$ with $f_q^0(i) < 0$, then the current
vectors $\bar{y}$ and $\bar{\pi}$ are not optimal for the linear relaxation of Problem $S_M$. In such
a case we add the column corresponding to this tour (the one with negative $f_q^0(i)$)
to the set of columns in Problem $S'_M$. If, on the other hand, $f_q^0(i) \ge 0$ for every $q$
and $i$, then the current $\bar{y}$ and $\bar{\pi}$ are optimal for the linear relaxation of Problem $S_M$.
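
The recursion (16.6) translates almost directly into code. The following is a minimal sketch, assuming positive integer demands, a symmetric modified distance matrix with the depot as node 0, and no elimination of two-loops; the function name `price_routes` and the instance are assumptions for illustration only.

```python
import math

def price_routes(dp, w, Q):
    """Dynamic program for the pricing step.

    dp[i][j]: modified distance between nodes i and j (node 0 = depot)
    w[i]    : positive integer demand of customer i (w[0] is ignored)
    Returns (f, f0): f[q][i] is the least cost of a depot-to-i path of
    load exactly q, and f0[q][i] = f[q][i] + dp[i][0] closes the tour.
    """
    n = len(w) - 1
    f = [[math.inf] * (n + 1) for _ in range(Q + 1)]
    for i in range(1, n + 1):          # initial conditions: q = w_i
        if w[i] <= Q:
            f[w[i]][i] = dp[0][i]
    for q in range(1, Q + 1):          # loads in increasing order
        for i in range(1, n + 1):
            prev = q - w[i]
            if prev < 1:
                continue
            best = min((f[prev][j] + dp[j][i]
                        for j in range(1, n + 1) if j != i),
                       default=math.inf)
            f[q][i] = min(f[q][i], best)
    f0 = [[f[q][i] + dp[i][0] for i in range(n + 1)] for q in range(Q + 1)]
    return f, f0

# Hypothetical modified distances for a depot and two customers
# (diagonal entries are never used and are set to 0 here).
dp = [[0.0, 0.0, 4.0],
      [0.0, 0.0, -6.0],
      [4.0, -6.0, 0.0]]
w = [0, 2, 3]
f, f0 = price_routes(dp, w, Q=5)
negative = [(q, i) for q in range(1, 6) for i in (1, 2) if f0[q][i] < 0]
```

Here the tours ending at customer 1 or customer 2 with load 5 both have value $-2$, so a column with negative reduced cost exists and would be added to the restricted problem. The three nested loops make the $O(n^2 Q)$ running time visible.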

To summarize, the column generation algorithm can be described as follows. 



The Column Generation Procedure

Step 1: Generate an initial set of $R'_M$ columns.

Step 2: Solve Problem $S'_M$ and find $\bar{y}$ and $\bar{\pi}$.

Step 3: Construct the distance matrix $\{d'_{ij}\}$ and find $f_q^0(i)$ for all $i \in N$ and
$1 \le q \le Q$.

Step 4: For every $i$ and $q$ with $f_q^0(i) < 0$, add the corresponding column to $R'_M$ and
go to Step 2.

Step 5: If $f_q^0(i) \ge 0$ for all $i$ and $q$, stop.



The procedure produces a vector $\bar{y}$ which is the optimal solution to the linear
relaxation of Problem $S_M$. This is a lower bound on the optimal solution to the
VRP.



16.3 Solving the Set-Partitioning Problem 

In the previous section we introduced an effective method for solving the linear
relaxation of the set-partitioning formulation of the VRP, Problem $S_M$. How can
we use this solution to the linear program to find an optimal or near-optimal
integer solution?

Starting with the set of columns present at the end of the column generation step
(the set $E$), one approach to generating an integer solution to the set-partitioning
formulation is to use the method of branch and bound. This method consists of 
splitting the problem into easier subproblems by fixing the value of a branching 
variable. The variable (in this case a suitable choice is y r for some route r) is either 
set to 1 or 0. Each of these subproblems is solved using the same method; that is, 
another variable is branched. At each step, tests are performed to see if the entire 
branch can be eliminated; that is, no better solution than the one currently known 
can be found in this branch. The solution found by this method will be the best 
integer solution among all the solutions in E. This solution will not necessarily be 
the optimal solution to the VRP, but it may be close. 
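As a rough illustration of branching on $y_r \in \{1, 0\}$ over a fixed column
set, the following Python sketch enumerates partitions depth-first and prunes a
branch when its accumulated cost cannot beat the incumbent. It is a minimal
sketch, not the method used in the literature cited here: the representation of a
column as a pair (set of customers, cost) is an assumption, route costs are
assumed nonnegative, and the pruning test uses only the cost so far rather than
a linear programming bound.

```python
from typing import FrozenSet, List, Optional, Sequence, Tuple


def best_partition(
    customers: FrozenSet[int],
    columns: Sequence[Tuple[FrozenSet[int], float]],
) -> Tuple[float, Optional[List[int]]]:
    """Return (cost, column indices) of the cheapest exact partition, if any."""
    best_cost = float("inf")
    best_routes: Optional[List[int]] = None

    def branch(r: int, uncovered: FrozenSet[int], cost: float, chosen: List[int]) -> None:
        nonlocal best_cost, best_routes
        if cost >= best_cost:            # bound: valid because costs are nonnegative
            return
        if not uncovered:                # every customer is served exactly once
            best_cost, best_routes = cost, list(chosen)
            return
        if r == len(columns):            # no columns left, branch is infeasible
            return
        served, c = columns[r]
        if served <= uncovered:          # y_r = 1 only if no customer is served twice
            chosen.append(r)
            branch(r + 1, uncovered - served, cost + c, chosen)
            chosen.pop()
        branch(r + 1, uncovered, cost, chosen)   # y_r = 0

    branch(0, frozenset(customers), 0.0, [])
    return best_cost, best_routes
```

For example, with three customers and columns serving {1}, {2}, {3}, {1, 2},
{2, 3} and {1, 2, 3}, the function returns the cheapest combination that serves
each customer exactly once.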

Another approach that will generate the same integer solution as the branch
and bound method is the following. Given a fractional solution to $S_M$, we can
generate a set of constraints that will cut off this fractional solution. Then we
can re-solve this linear program and, if it is integer, we have found the optimal
integer solution (among the columns of $E$). If it is still fractional, then we
can continue generating constraints and re-solving the linear program until an
integer solution is found. Again, the best integer solution found using this
method may be close to optimal. This is the method successfully used by Hoffman
and Padberg (1993) to solve crew-scheduling problems.

Formally, the method is as follows. 







The Cutting Plane Algorithm 

Step 1: Generate an initial set $R'_M$ of columns.

Step 2: Solve, using column generation, Problem $S'_M$.

Step 3: If the optimal solution to Problem $S'_M$ is integer, stop.
        Else, generate cutting planes separating this solution.
        Add these cutting planes to the linear program $S'_M$.

Step 4: Solve the linear program $S'_M$. Go to Step 3.

To illustrate this constraint generation step (Step 3), we make use of a number of
observations. First, let $E$ be the set of routes at the end of the column
generation procedure. Clearly, we can split $E$ into two subsets. One subset,
$E_m$, includes every column $r$ for which there is at least one $i$ with
$\xi_{ir} \ge 2$; these columns are called multiple visit columns. The second
subset, $E_s$, includes the remaining columns; these columns are referred to as
single visit columns. It is evident that an optimal solution to the VRP will use
no columns from $E_m$. That is, there always exists a single visit column of at
most the same cost that can be used instead. We therefore can immediately add the
following constraint to the linear relaxation of Problem $S_M$:

$$\sum_{r \in E_m} y_r = 0. \tag{16.7}$$

To generate more constraints, construct the intersection graph G. The graph 
G has a node for each column in E s . Two nodes in G are connected by an edge 
if the corresponding columns have at least one customer in common. Observe 
that a solution to the VRP where no customer is visited more than once can be 
represented by an independent set in this graph. That is, it is a collection of nodes 
on the graph G such that no two nodes are connected by an edge. 
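A small Python sketch may make the construction concrete. It assumes, purely for
illustration, that each single visit column is given as a set of customer
indices; the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Sequence, Set


def intersection_graph(columns: Sequence[FrozenSet[int]]) -> Dict[int, Set[int]]:
    """Adjacency sets of G: one node per column, an edge when two columns
    share at least one customer."""
    adj: Dict[int, Set[int]] = {r: set() for r in range(len(columns))}
    for r, s in combinations(range(len(columns)), 2):
        if columns[r] & columns[s]:
            adj[r].add(s)
            adj[s].add(r)
    return adj


def is_independent_set(adj: Dict[int, Set[int]], nodes: Iterable[int]) -> bool:
    """True when no two of the given nodes are joined by an edge of G."""
    chosen = set(nodes)
    return all(adj[r].isdisjoint(chosen) for r in chosen)
```

A set of columns in which no customer appears twice corresponds to a node set for
which `is_independent_set` returns `True`.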

These observations give rise to two inequalities that can be added to the formu- 
lation. 

1. We select a subset of the nodes of $G$, say $K$, such that every pair of nodes
$i, j \in K$ is connected by an edge of $G$. Each set $K$, called a clique, must
satisfy the following condition.

$$\sum_{r \in K} y_r \le 1. \tag{16.8}$$

Clearly, if there is a node $j \notin K$ such that $j$ is adjacent to every
$i \in K$, then we can replace $K$ with $K \cup \{j\}$ in inequality (16.8) to
strengthen it (this is called lifting). We would therefore like to use inequality
(16.8) when the set of nodes $K$ is maximal, that is, when it cannot be lifted
any further.

2. Define a cycle $C = \{u_1, u_2, \ldots, u_\ell\}$ in $G$, such that node $u_i$
is adjacent to $u_{i+1}$, for each $i = 1, 2, \ldots, \ell - 1$, and node
$u_\ell$ is adjacent to node $u_1$. A cycle $C$ is called an odd cycle if the
number of nodes in $C$, $|C| = \ell$, is odd. An odd cycle is called an odd hole
if there is no arc connecting two nodes of the cycle except the $\ell$ arcs
defining the cycle. It is easy to see that in any optimal solution to the VRP
each odd hole must satisfy the following property.

$$\sum_{r \in C} y_r \le \frac{|C| - 1}{2}. \tag{16.9}$$

16.3.1 Identifying Violated Clique Constraints 

Hoffman and Padberg suggest several procedures for clique identification, one of
which is based on the fact that small size problems can be solved quickly by
enumeration. For this purpose, select $v$ to be the node with minimum degree
among all nodes of $G$. Clearly, every clique of $G$ containing $v$ is a subset
of $v$ together with the neighbors of $v$, denoted by $neigh(v)$. Thus, starting
with $v$ as a temporary clique, that is, $K = \{v\}$, we add an arbitrary node
$w$ from $neigh(v)$ to $K$. We now delete from $neigh(v)$ all nodes that are not
connected to every node of $K$, in this case $v$ and $w$. Continue adding nodes
in this manner from the current set $neigh(v)$ to $K$ until either there is no
node in $neigh(v)$ connected to all nodes in $K$, or $neigh(v) = \emptyset$. In
the end, $K$ will be a maximal clique. We can then calculate the weight of this
clique, that is, the sum of the values (in the linear program) of the columns in
the clique. If the weight is more than one, then the corresponding clique
inequality is violated. If not, then we continue the procedure with a new
starting node. The method can be improved computationally by, for example, always
choosing the “heaviest” among those nodes eligible to enter the clique.
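The greedy growth described above can be sketched as follows. This is a minimal
sketch, not the routine of Hoffman and Padberg: it assumes the intersection graph
is given as a dictionary of adjacency sets with integer node labels and that `y`
maps each node to its current LP value; it uses the "heaviest eligible node"
rule, with ties broken by the smallest label.

```python
from typing import Dict, Optional, Set


def greedy_clique(adj: Dict[int, Set[int]], y: Dict[int, float], start: int) -> Set[int]:
    """Grow a maximal clique containing `start`."""
    clique = {start}
    candidates = set(adj[start])              # neigh(v): nodes adjacent to all of K
    while candidates:
        # Heaviest eligible node first; ties broken by the smallest label.
        w = max(candidates, key=lambda r: (y[r], -r))
        clique.add(w)
        candidates &= adj[w]                  # keep only nodes adjacent to w as well
    return clique


def violated_clique(
    adj: Dict[int, Set[int]], y: Dict[int, float], tol: float = 1e-9
) -> Optional[Set[int]]:
    """Return a clique K whose LP weight exceeds 1, or None if none is found."""
    for v in sorted(adj, key=lambda r: (len(adj[r]), r)):   # minimum degree first
        clique = greedy_clique(adj, y, v)
        if sum(y[r] for r in clique) > 1.0 + tol:
            return clique
    return None
```

Because the search is greedy, a `None` result means only that this heuristic found
no violated clique inequality, not that none exists.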

16.3.2 Identifying Violated Odd Hole Constraints 

Hoffman and Padberg use the following procedure to identify violated odd hole
constraints. Suppose $y$ is the current optimal solution to the linear program
and $G$ is the corresponding intersection graph. Starting from an arbitrary node
$v \in G$, construct a layered graph $G_\ell(v)$ as follows. The node set of
$G_\ell(v)$ is the same as the node set of $G$. Every neighbor of $v$ in $G$ is
connected to $v$ by an edge in $G_\ell(v)$. We refer to $v$ as the root, or level
0 node, and we refer to the neighbors of $v$ as level 1 nodes. Similarly, nodes
at level $k \ge 2$ are those nodes in $G$ that are connected (in $G$) to a level
$k - 1$ node but are not connected to any node at a level lower than $k - 1$.
Finally, each edge $(u_i, u_j)$ in $G_\ell(v)$ is assigned a length of
$1 - y_{u_i} - y_{u_j} \ge 0$. Now pick a node $u$ in $G_\ell(v)$ at level
$k \ge 2$ and find the shortest path from $u$ to $v$ in $G_\ell(v)$. Delete all
nodes at levels $i$ ($1 \le i < k$) that are either on the shortest path or
adjacent to nodes along this shortest path (other than nodes that are adjacent
to $v$). Now pick another node $w$ at level $k$ that is adjacent (in $G$) to $u$.
Find the shortest path from $w$ to $v$ in the current graph $G_\ell(v)$.
Combining these two paths with the arc $(u, w)$ creates an odd hole. If the total
length of this cycle is less than 1, then we have found a violated odd hole
inequality. If not, we continue with another neighbor of $u$ and repeat the
process. We can then choose a node different from $u$ at level $k$. If no
violated odd hole inequality is found at level $k$, we proceed to level $k + 1$.
This subroutine can be repeated for different starting nodes ($v$) as well.
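The test "total length less than 1" can be connected to the odd hole inequality
by a one-line calculation. The derivation below is a sketch that assumes the
standard form of the odd hole inequality, $\sum_{r \in C} y_r \le (|C| - 1)/2$,
as the reading of (16.9); it uses the fact that every node of a cycle $C$ lies on
exactly two of its edges.

```latex
\sum_{(u,w) \in C} \bigl( 1 - y_u - y_w \bigr) \;=\; |C| - 2 \sum_{r \in C} y_r ,
\qquad\text{so}\qquad
|C| - 2 \sum_{r \in C} y_r < 1
\;\Longleftrightarrow\;
\sum_{r \in C} y_r > \frac{|C| - 1}{2} .
```

Hence a cycle of total length less than 1 is exactly one whose LP weight exceeds
$(|C| - 1)/2$.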



16.4 The Effectiveness of the Set-Partitioning 
Formulation 

The effectiveness of this algorithm depends crucially on the quality of the
initial lower bound; this lower bound is the optimal solution to the linear
relaxation of Problem $S_M$. If this lower bound is not very tight, then the
branch and bound or the constraint generation methods will most likely not be
computationally effective. On the other hand, when the gap between the lower
bound and the best integer solution is small, the procedure will probably be
effective.

Fortunately, many researchers have reported that the linear relaxation of the
set-partitioning problem, Problem $S_M$, provides a solution close to the optimal
integer solution (see, e.g., Desrochers et al. (1992)). That is, the solution to
the linear relaxation of Problem $S_M$ provides a very tight lower bound on the
solution of the VRP. For instance, in their paper, Desrochers et al. report an
average relative gap between the optimal solution to the linear relaxation and
the optimal integer solution of only 0.733%. A possible explanation for this
observation is embodied in the following theorem, which states that
asymptotically the relative error between the optimal solution to the linear
relaxation of the set-partitioning model and the optimal integer solution goes to
zero as the number of customers increases. Consider again the general VRP with
capacity and time window constraints.

Theorem 16.4.1 Let the customer locations $x_1, x_2, \ldots, x_n$ be a sequence
of independent random variables having a distribution $\mu$ with compact support
in $\mathbb{R}^2$. Let the customer parameters (see Chapter 15) be independently
and identically distributed like $\Phi$. Let $Z^{LP}$ be the value of the optimal
fractional solution to $S$, and let $Z^*$ be the value of the optimal integer
solution to $S$; that is, the value of the optimal solution to the VRP. Then

$$\lim_{n \to \infty} \frac{1}{n} Z^{LP} = \lim_{n \to \infty} \frac{1}{n} Z^* \quad (a.s.).$$

The theorem thus implies that the optimal solution value of the linear
programming relaxation of Problem $S$ tends to the optimal solution of the
vehicle routing problem as the number of customers tends to infinity. This is
important since, as shown by Bramel and Simchi-Levi (1997), other classical
formulations of the VRP can lead to diverging linear and integer solution values
(see Exercise 16.8).

In the next section we motivate Theorem 16.4.1 by presenting a simplified model 
which captures the essential ideas of the proof. Finally, in Section 16.4.2 we provide 
a formal proof of the theorem. Again, to simplify the presentation, we assume no 
time window constraints exist; for the general case, the interested reader is referred 
to Bramel and Simchi-Levi (1997). 
16.4.1 Motivation

Define a customer type to be a location $x \in \mathbb{R}^2$ and a customer
demand $w$; that is, a customer type defines the customer location and a value
for the customer demand. Consider a discretized vehicle routing model in which
there is a finite number $W$ of customer types, and a finite number $m$ of
distinct customer locations. Let $n_i$ be the number of customers of type $i$,
for $i = 1, 2, \ldots, W$, and let $n = \sum_{i=1}^{W} n_i$ be the total number
of customers. Clearly, this discretized vehicle routing problem can be solved by
formulating it as a set-partitioning problem. To obtain some intuition about the
linear relaxation of $S$, we introduce another formulation of the vehicle routing
problem closely related to $S$.

Let a vehicle assignment be a vector $(a_1, a_2, \ldots, a_W)$, where the
$a_i \ge 0$ are integers, and such that a single vehicle can feasibly serve $a_1$
customers of type 1, and $a_2$ customers of type 2, ..., and $a_W$ customers of
type $W$ together without violating the vehicle capacity constraint. Index all
the possible vehicle assignments $1, 2, \ldots, R_a$ and let $c_r$ be the total
length of the shortest feasible route serving the customers in vehicle assignment
$r$. (Note that $R_a$ is independent of $n$.) The vehicle routing problem can be
formulated as follows. Let

$A_{ir}$ = number of customers of type $i$ in vehicle assignment $r$,

for each $i = 1, 2, \ldots, W$ and $r = 1, 2, \ldots, R_a$. Let

$y_r$ = number of times vehicle assignment $r$ is used in the optimal solution.

The new formulation of this discretized VRP is:

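$$\begin{aligned}
\text{Problem } S_N: \quad \min \quad & \sum_{r=1}^{R_a} y_r c_r \\
\text{s.t.} \quad & \sum_{r=1}^{R_a} y_r A_{ir} \ge n_i, \quad \forall i = 1, 2, \ldots, W, \\
& y_r \ge 0 \text{ and integer}, \quad \forall r = 1, 2, \ldots, R_a.
\end{aligned}$$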

Let $Z_N^*$ be the value of the optimal solution to Problem $S_N$ and let
$Z_N^{LP}$ be the optimal solution to the linear relaxation of Problem $S_N$.
Clearly, Problem $S$ and Problem $S_N$ have the same optimal solution values;
that is, $Z^* = Z_N^*$, while their linear relaxations may be different. Define
$c = \max_{r = 1, 2, \ldots, R_a} \{c_r\}$; that is, $c$ is the length of the
longest route among the $R_a$ vehicle assignments. Using an analysis identical to
the one in Section 5.2, we obtain:

Lemma 16.4.2

$$Z^{LP} \le Z^* \le Z_N^{LP} + Wc \le Z^{LP} + Wc.$$
Observe that the upper bound on $Z^*$ obtained in Lemma 16.4.2 consists of two
terms. The first, $Z^{LP}$, is a lower bound on $Z^*$, which clearly grows with
the number of customers $n$. The second term ($Wc$) is the product of two numbers
that are fixed and independent of $n$. Therefore, the upper bound on $Z^*$ of
Lemma 16.4.2 is dominated by $Z^{LP}$ and consequently we see that for large $n$,
$Z^* \approx Z^{LP}$, exactly what is implied by Theorem 16.4.1. Indeed, much of
the proof in the following section is concerned with approximating the
distributions $\mu$ (customer locations) and $\Phi$ (customer demands) with
discrete distributions and forcing the number of different customer types to be
independent of $n$.
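The text defers the proof of Lemma 16.4.2 to Section 5.2. As a hedged sketch of
the kind of rounding argument we assume is meant, let $\bar{y}$ denote a basic
optimal solution of the linear relaxation of Problem $S_N$ (the symbol $\bar{y}$
is introduced here for illustration). Such a solution has at most $W$ positive
components, one per constraint, and rounding each of them up keeps the covering
constraints satisfied while adding at most $c$ per rounded component:

```latex
% \bar{y}: assumed basic optimal solution of the LP relaxation of Problem S_N
Z^{LP} \le Z^* = Z_N^* \le \sum_{r=1}^{R_a} \lceil \bar{y}_r \rceil \, c_r
  \le \sum_{r=1}^{R_a} \bar{y}_r \, c_r + W c
  = Z_N^{LP} + W c \le Z^{LP} + W c ,
\qquad\text{hence}\qquad
0 \le \frac{Z^* - Z^{LP}}{n} \le \frac{W c}{n} \longrightarrow 0
\quad (n \to \infty).
```

The last limit uses only that $W$ and $c$ do not depend on $n$ in the discretized
model.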

16.4.2 Proof of Theorem 16.4.1

It is clear that $Z^{LP} \le Z^*$ and therefore
$\liminf_{n \to \infty} \frac{1}{n}(Z^* - Z^{LP}) \ge 0$. The interesting part is
to find an upper bound on $Z^*$ that involves $Z^{LP}$ and use this upper bound
to show that $\limsup_{n \to \infty} \frac{1}{n}(Z^* - Z^{LP}) \le 0$. We do this
in essentially the same way as in Section 16.4.1. We successively discretize the
problem by introducing a sequence of vehicle routing problems whose optimal
solutions are “relatively” close to $Z^*$. The last vehicle routing problem is a
discrete problem which therefore, as in Section 16.4.1, can be directly related
to the linear relaxation of its set-partitioning formulation. This linear program
is also shown to have an optimal solution close to $Z^{LP}$.

To prove the upper bound, let $N$ be the index set of customers, with $|N| = n$,
and let problem $P$ be the original VRP. Let $A$ be the compact support of the
distribution of the customer locations ($\mu$), and define
$d_{\max} = \sup\{\|x\| : x \in A\}$, where $\|x\|$ is the distance from point
$x \in A$ to the depot. Finally, pick a fixed $k > 1$.

Discretization of the Locations 

We start by constructing the following vehicle routing problem with discrete
locations. Define $\Delta = \frac{1}{k}$ and let $G(\Delta)$ be an infinite grid
of squares of diagonal $\Delta$, that is, of side $\frac{\Delta}{\sqrt{2}}$, with
edges parallel to the system coordinates. Let
$A_1, A_2, \ldots, A_{m(\Delta)}$ be the subregions of $G(\Delta)$ that intersect
$A$ and have $\mu(A_i) > 0$. Since $A$ is bounded, $m(\Delta)$ is finite for each
$\Delta > 0$. For convenience, we omit the dependence of $m$ on $\Delta$ in the
notation. For each subregion, let $X_i$ be the centroid of subregion $A_i$, that
is, the point at the center of the grid square containing $A_i$. This defines $m$
points $X_1, X_2, \ldots, X_m$; note that a customer is at most
$\frac{\Delta}{2}$ units from the centroid of the subregion in which it is
located.
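The geometry of this step is easy to check numerically. The sketch below assumes,
for illustration only, that the grid is anchored at the origin; the function name
`grid_centroid` is hypothetical. A square of diagonal $\Delta$ has side
$\Delta/\sqrt{2}$, so its center is at distance at most $\Delta/2$ from any point
of the square.

```python
import math
from typing import Tuple


def grid_centroid(x: float, y: float, delta: float) -> Tuple[float, float]:
    """Center of the grid square (diagonal `delta`, grid anchored at the
    origin by assumption) that contains the point (x, y)."""
    side = delta / math.sqrt(2.0)
    col = math.floor(x / side)
    row = math.floor(y / side)
    return ((col + 0.5) * side, (row + 0.5) * side)
```

Moving a customer to `grid_centroid(...)` therefore changes its position by at
most `delta / 2`, which is the quantity used in the bound that follows.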

Construct a new VRP, called $P(m)$, defined on the customers of $N$. Each of the
customers in $N$ is moved to the centroid of the subregion in which it is
located. Let $Z^*(m)$ be the optimal solution to $P(m)$. We clearly have

$$Z^* \le Z^*(m) + n\Delta. \tag{16.10}$$



Discretization of the Customer Demands 
We now describe a VRP where the customer demands are also discretized in much the
same way as is done in Section 5.2. Partition the interval $(0, 1]$ into
subintervals of size $\Delta$ ($= \frac{1}{k}$). This produces $k$ segments and
$I = k - 1$ points in the interval $(0, 1)$ which we call corners.

We refer to each centroid-corner pair as a customer type; each centroid defines
a customer location and each corner defines the customer demand. It is clear that
there are $mI$ possible customer types. An instance of a fully discretized
vehicle routing problem is then defined by specifying the number of customers of
each of the $mI$ types.

For each centroid $j = 1, 2, \ldots, m$, and corner $i = 1, 2, \ldots, I$, let

$$N_{ji} = \Bigl\{ h \in N : \frac{i-1}{k} < w_h \le \frac{i}{k} \text{ and } x_h \in A_j \Bigr\}.$$

Finally, for every $j = 1, 2, \ldots, m$ and every $i$, let $n_{ji} = |N_{ji}|$.

We now define a fully discretized vehicle routing problem $P_k(m)$, whose optimal
solution value is denoted $Z_k^*(m)$. The vehicle routing problem $P_k(m)$ is
defined as having $\min\{n_{ji}, n_{j,i+1}\}$ customers located at centroid $j$
with customer demand equal to $\frac{i}{k}$, for each $i = 1, 2, \ldots, I$ and
$j = 1, 2, \ldots, m$.
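The counting behind $P_k(m)$ can be sketched in Python. The sketch rests on two
assumptions that should be checked against the definitions in the text: that a
customer with demand $w$ falls in class $i$ when $(i-1)/k < w \le i/k$, and that
each customer is given as a pair (centroid index, demand). The function names and
the small floating-point tolerance are illustrative choices.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple


def demand_class(w: float, k: int) -> int:
    """Index i with (i-1)/k < w <= i/k, for a demand 0 < w <= 1 (assumed)."""
    return max(1, math.ceil(w * k - 1e-12))   # tolerance guards rounding at corners


def discretized_counts(
    customers: Iterable[Tuple[int, float]], k: int
) -> Tuple[Counter, Dict[Tuple[int, int], int], int]:
    """Return (n, kept, surplus): n[(j, i)] is the class count, kept[(j, i)] is
    min{n_ji, n_j,i+1} for corners i = 1..k-1, and surplus is the double sum of
    |n_ji - n_j,i+1| that appears in the bounds below."""
    n = Counter((j, demand_class(w, k)) for j, w in customers)
    centroids = {j for j, _ in n}
    kept: Dict[Tuple[int, int], int] = {}
    surplus = 0
    for j in centroids:
        for i in range(1, k):                 # corners i = 1, ..., k - 1
            kept[(j, i)] = min(n[(j, i)], n[(j, i + 1)])
            surplus += abs(n[(j, i)] - n[(j, i + 1)])
    return n, kept, surplus
```

The value `surplus` is the quantity that the subsequent lemmas show to be small
relative to $n$.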

We have the following result. 



Lemma 16.4.3

$$Z^*(m) \le Z_k^*(m) + 2 d_{\max} \sum_{j=1}^{m} \sum_{i=1}^{I} |n_{ji} - n_{j,i+1}|.$$



Proof. Observe:

(i) In $P_k(m)$, the number of customers at centroid $j$ and with demand defined
by corner $i$ is $\min\{n_{ji}, n_{j,i+1}\}$.

(ii) In $P(m)$ each customer belongs to exactly one of the subsets $N_{ji}$, for
$j = 1, 2, \ldots, m$ and $i = 1, 2, \ldots, I$.

(iii) In $P(m)$ the customers in $N_{ji}$ have smaller loads than the customers
of $P_k(m)$ at centroid $j$ with demand defined by corner $i$.

Given an optimal solution to $P_k(m)$, let us construct a solution to $P(m)$. For
each centroid $j = 1, 2, \ldots, m$ and corner $i = 1, 2, \ldots, I$, we pick any
$\max\{n_{ji} - n_{j,i+1}, 0\}$ customers from $N_{ji}$ and serve them in
individual vehicles. The remaining $\min\{n_{ji}, n_{j,i+1}\}$ customers in
$N_{ji}$ can be served with exactly the same vehicle schedules as in $P_k(m)$.
This can be done due to (iii), and therefore one can always serve the customers
of $P(m)$ in the same vehicles in which the customers of $P_k(m)$ are served. ∎

Now $P_k(m)$ is fully discrete and we can apply results as in Section 16.4.1. Let
$Z_k^{LP}(m)$ be the optimal solution to the linear relaxation of the
set-partitioning formulation of the routing problem $P_k(m)$. Let $c$ be defined
as in Section 16.4.1; that is, it is the cost of the most expensive tour among
all the possible routes in $P_k(m)$.
Lemma 16.4.4

$$Z_k^*(m) \le Z_k^{LP}(m) + mIc.$$

Proof. Since the number of customer types is at most $mI$, we can formulate
$P_k(m)$ as the integer program, like Problem $S_N$, described in Section 16.4.1,
with $mI$ constraints. The bound then follows from Lemma 16.4.2. ∎

Recall that $Z^{LP}$ is the optimal solution to the linear relaxation of the
set-partitioning formulation of the VRP defined by problem $P$. Then

Lemma 16.4.5

$$Z_k^{LP}(m) \le Z^{LP} + n\Delta.$$

Proof. Let $\{y_r : r = 1, 2, \ldots, R\}$ be the optimal solution to the linear
relaxation of the set-partitioning formulation of problem $P$. We can assume (see
Exercise 16.3) that $\sum_{r=1}^{R} y_r \alpha_{ir} = 1$, for each
$i = 1, 2, \ldots, n$. We construct a feasible solution to the linear relaxation
of the set-partitioning formulation of $P_k(m)$ using the values $y_r$. Since
every customer in $P_k(m)$ assigned to centroid $j$ and corner $i$ can be
associated with a customer $h$ in $P$ with $x_h \in A_j$ and whose demand is at
least as large, each route $r$ with $y_r > 0$ can be used to construct a route
$r'$ feasible for $P_k(m)$. Since in $P_k(m)$ the customers are at the centroids
instead of at their original locations, we modify the route so that the vehicle
travels from the customer to its centroid and back. Thus, the length (cost) of
route $r'$ is at most the cost of route $r$ in $P$ plus $n_r \Delta$, where $n_r$
is the number of customers in route $r$.

To create a feasible solution to the linear relaxation of the set-partitioning
formulation of $P_k(m)$ we take the solution to the linear relaxation of $P$ and
create the routes $r'$ as above. Therefore,

$$Z_k^{LP}(m) \le Z^{LP} + \sum_{r=1}^{R} y_r n_r \Delta \le Z^{LP} + n\Delta.$$ ∎

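We can now prove Theorem 16.4.1.

$$\begin{aligned}
Z^* &\le Z^*(m) + n\Delta \\
&\le Z_k^*(m) + 2 d_{\max} \sum_{j=1}^{m} \sum_{i=1}^{I} |n_{ji} - n_{j,i+1}| + n\Delta \\
&\le Z_k^{LP}(m) + mIc + 2 d_{\max} \sum_{j=1}^{m} \sum_{i=1}^{I} |n_{ji} - n_{j,i+1}| + n\Delta \\
&\le Z^{LP} + mIc + 2 d_{\max} \sum_{j=1}^{m} \sum_{i=1}^{I} |n_{ji} - n_{j,i+1}| + 2n\Delta.
\end{aligned}$$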

We now need to show that Z LP is the dominant part of the last upper bound. 
We do that using the following lemma. 
Lemma 16.4.6 There exists a constant $K$ such that

$$\lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{m} \sum_{i=1}^{I} |n_{ji} - n_{j,i+1}| \le \frac{2K}{k}.$$



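Proof. In Section 5.2 we prove that, given $i$ and $j$, there exists a constant
$K$ that bounds $\lim_{n \to \infty} \frac{1}{n} |n_{ji} - n_{j,i+1}|$ from
above. A similar analysis, summing over $j = 1, 2, \ldots, m$ and
$i = 1, 2, \ldots, I$, gives

$$\lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{m} \sum_{i=1}^{I} |n_{ji} - n_{j,i+1}| \le \frac{2K}{k}.$$ ∎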

Finally observe that each tour in $P_k(m)$ has a total length no more than 1,
since the truck travels at a unit speed and the length of each working day is 1.
Hence, $mIc = O(1)$, and therefore,

$$\limsup_{n \to \infty} \frac{1}{n}(Z^* - Z^{LP}) \le \frac{4 d_{\max} K}{k} + 2\Delta = \frac{2}{k}(2K d_{\max} + 1).$$

Since $K$ is a constant and $k$ was arbitrary, we see that the right-hand side
can be made arbitrarily small. Therefore,

$$0 \le \liminf_{n \to \infty} \frac{1}{n}(Z^* - Z^{LP}) \le \limsup_{n \to \infty} \frac{1}{n}(Z^* - Z^{LP}) \le 0.$$

We conclude this chapter with the following observation. The proof of Theorem
16.4.1 also reveals an upper bound on the rate of convergence of $Z^{LP}$ to its
asymptotic value. Indeed (see Exercise 16.1), we have

$$E(Z^*) \le E(Z^{LP}) + O(n^{3/4}). \tag{16.11}$$



16.5 Exercises 



Exercise 16.1. Prove the upper bound on the convergence rate (equation (16.11)). 

Exercise 16.2. Consider an undirected graph $G = (V, E)$ where each edge $(i, j)$
has a cost $c_{ij}$ and each vertex $i \in V$ a nonnegative penalty $\pi_i$. In
the Prize-Collecting Traveling Salesman Problem (PCTSP), the objective is to find
a tour that visits a subset of the vertices such that the length of the tour plus
the sum of penalties of all vertices not in the tour is as small as possible.
Show that the problem can be formulated as a Longest Path Problem between two
prespecified nodes of a new network.

Exercise 16.3. Consider the Bin-Packing Problem. Let $w_i$ be the size of item
$i$, $i = 1, \ldots, n$, and assume the bin capacity is 1. An important
formulation of the Bin-Packing Problem is as a set-covering problem. Let

$$F = \Bigl\{ S : \sum_{i \in S} w_i \le 1 \Bigr\}.$$

Define

$$\alpha_{iS} = \begin{cases} 1, & \text{if item } i \text{ is in } S, \\ 0, & \text{otherwise,} \end{cases}$$

for each $i = 1, 2, \ldots, n$ and each $S \in F$. Finally, for any $S \in F$, let

$$y_S = \begin{cases} 1, & \text{if the items in } S \text{ are packed in a single bin with no other items,} \\ 0, & \text{otherwise.} \end{cases}$$

In the set-covering formulation of the Bin-Packing Problem, the objective is to
select a minimum number of feasible bins such that each item is included in some
bin. It is the following integer program.

$$\begin{aligned}
\text{Problem } P: \quad \min \quad & \sum_{S \in F} y_S \\
\text{s.t.} \quad & \sum_{S \in F} y_S \alpha_{iS} \ge 1, \quad \forall i = 1, 2, \ldots, n, \qquad (16.12) \\
& y_S \in \{0, 1\}, \quad \forall S \in F.
\end{aligned}$$

Let $Z^*$ be the optimal solution to Problem $P$ and let $Z^{LP}$ be the optimal
solution to the linear relaxation of Problem $P$. We want to prove that

$$Z^* \le 2 Z^{LP}. \tag{16.13}$$

(a) Formulate the dual of the linear relaxation of Problem $P$.

(b) Show that $\sum_{i=1}^{n} w_i \le Z^{LP}$.

(c) Argue that $Z^* \le 2 \sum_{i=1}^{n} w_i$. Conclude that (16.13) holds.

(d) An alternative formulation to Problem $P$ is obtained by replacing
constraints (16.12) with equality constraints. Call the new problem Problem $PE$.
Show that the optimal solution value of the linear relaxation of Problem $P$
equals the optimal solution value of the linear relaxation of Problem $PE$.
Exercise 16.4. Recall the dynamic program given by equation (16.6). Let

$$f = \min_{i \in N} \; \min_{w_i \le q \le Q} f_q(i).$$

Consider the function defined as follows.

$$g_q(i) = \min_{w_i \le q' \le q} \bigl\{ f_{q'}(i) + f_{q - q' + w_i}(i) \bigr\},$$

for each $i \in N$ and $w_i \le q \le Q$. Now define
$g = \min_{i \in N} \min_{w_i \le q \le Q} g_q(i)$. Show that $f = g$.



Exercise 16.5. Develop a dynamic programming procedure for the column generation
step similar to $f_q(i)$ that avoids two-loops (loops of the type
$\ldots, i, j, i, \ldots$). What is the complexity of this procedure?



Exercise 16.6. Develop a dynamic programming procedure for the column generation
step in the presence of time-window constraints. What is required of the
time-window data in order for this to be possible? What is the complexity of your
procedure?



Exercise 16.7. Develop a dynamic programming procedure for the column gen- 
eration step in the presence of a distance constraint on the length of any route. 
What is required of the distance data in order for this to be possible? What is the 
complexity of your procedure? 

Exercise 16.8. Consider an instance of the VRPTW with $n$ customers. Given a
subset of the customers $S$, let $b^*(S)$ be the minimum number of vehicles
required to carry the demands of customers in $S$; that is, $b^*(S)$ is the
solution to the Bin-Packing Problem defined on the demands of all customers in
$S$. For $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, n$, let

$$x_{ij} = \begin{cases} 1, & \text{if a vehicle travels directly between points } i \text{ and } j, \\ 0, & \text{otherwise.} \end{cases}$$

Let 0 denote the depot and define $c_{ij}$ as the cost of traveling directly
between points $i$ and $j$, for $i, j = 0, 1, 2, \ldots, n$. Let $t_i$ represent
the time a vehicle arrives at the location of customer $i$ and, for every $i$ and
$j$ such that $i < j$, define $M_{ij} = \max\{l_i + d_{ij} - e_j, 0\}$, where
$d_{ij} = \|Y_i - Y_j\|$. Then the following is a valid formulation of the VRPTW.

$$\begin{aligned}
\text{Problem } P': \quad \min \quad & \sum_{i<j} c_{ij} x_{ij} \\
\text{s.t.} \quad & \sum_{j:\, i<j} x_{ij} + \sum_{j:\, i>j} x_{ji} = 2, && \forall i = 1, 2, \ldots, n, \\
& \sum_{i,j \in S} x_{ij} \le |S| - b^*(S), && \forall S \subset \{1, 2, \ldots, n\},\ 2 \le |S| \le n - 1, \\
& e_i \le t_i \le l_i - s_i, && 1 \le i \le n, \\
& t_i + s_i + d_{ij} - t_j \le M_{ij}(1 - x_{ij}), && 1 \le i < j \le n, \\
& x_{ij} \in \{0, 1\}, && 1 \le i < j \le n, \qquad (16.14) \\
& x_{0j} \in \{0, 1, 2\}, && j = 1, 2, \ldots, n. \qquad (16.15)
\end{aligned}$$

The case $x_{0j} = 2$ corresponds to a vehicle serving only customer $j$. The
linear programming relaxation of $P'$ is obtained by replacing constraints (16.14)
and (16.15) by their linear equivalents.

Construct an instance of the VRPTW in which the fractional and integer solutions
to the above linear program do not approach the same value asymptotically.
17

Network Planning 



17.1 Introduction 

In this chapter we present some of the issues involved in the practice of supply chain 
design and planning. These are issues that are often not dealt with in traditional 
operations research analyses. However, they are essential in transforming raw data 
and problem characteristics to modeling assumptions, input data and decisions. 

Our focus is on what we call network planning: the process by which the firm
structures and manages the supply chain in order to

• Find the right balance between inventory, transportation and manufacturing 
costs, 

• Match supply and demand under uncertainty by positioning inventory effec- 
tively, 

• Utilize resources effectively in a dynamic environment. 

Of course, this is a complex process, which requires a hierarchical approach in 
which decisions on network design, inventory positioning and management, as well 
as resource utilization are combined to reduce cost and increase service level. Thus, 
we divide the network planning process into three steps: 

1. Network design: this includes decisions on the number, locations and size 
of manufacturing plants and warehouses, the assignment of retail outlets to 
warehouses, etc. Major sourcing decisions are also made at this point and 
the typical planning horizon is a few years. 
2. Inventory positioning: this includes identifying stocking points as well as
selecting facilities that will produce to stock and thus keep inventory, and 
facilities that will produce to order and hence keep no inventory. 

3. Resource allocation: given the structure of the logistics network and the 
location of stocking points, the objective in this step is to determine when 
and how much to produce or purchase and where and when to store inven- 
tory. These decisions require identifying the optimal tradeoff between setup 
costs and times, and inventory and transportation costs, taking into account 
production, sourcing and warehousing capacities as well as other business 
rules and constraints. 

In this chapter we analyze each of these steps, and provide examples of the 
processes involved. 



17.2 Network Design 

Network design determines the physical configuration and infrastructure of the 
supply chain. As explained in Chapter 1, network design is a strategic decision that 
has a long-lasting effect on the firm. Network design involves decisions relating to 
plant and warehouse location as well as distribution and sourcing. 

The supply chain infrastructure typically needs to be reevaluated due to changes 
in demand patterns, product mix, production processes, sourcing strategies or the 
cost of running facilities. In addition, mergers and acquisitions may mandate the 
integration of different logistics networks. 

In the discussion below, we concentrate on the following key strategic decisions: 

1. Determining the appropriate number of facilities such as plants and ware- 
houses. 

2. Determining the location of each facility. 

3. Determining the size of each facility. 

4. Allocating space for products in each facility. 

5. Determining the production requirements in each plant. 

6. Determining sourcing requirements. 

7. Determining distribution. 

The objective is to design or reconfigure the logistics network in order to mini- 
mize annual system-wide cost, including production and purchasing costs, inven- 
tory holding costs, facility costs (storage, handling, and fixed costs), and trans- 
portation costs, subject to a variety of service level requirements. In this setting, 
the trade-offs are clear. Increasing the number of warehouses typically yields 
• An improvement in service level due to the reduction in average travel time to
the customers.

• An increase in inventory costs due to increased safety stocks required to 
protect each warehouse against uncertainties in customer demands. 

• An increase in overhead and setup costs. 

• A reduction in outbound transportation costs: transportation costs from the 
warehouses to the customers. 

• An increase in inbound transportation costs: transportation costs from the 
suppliers and/or manufacturers to the warehouses. 

In essence, the firm must balance the costs of opening new warehouses with the 
advantages of being close to the customer. Thus, warehouse location decisions are 
crucial determinants of whether the supply chain is an efficient channel for the 
distribution of products. 

We describe below some of the issues related to data collection and the cal- 
culation of costs required for the optimization models. Some of the information 
provided is based on logistics textbooks such as Ballou (1992), Johnson and Wood 
(1986) and Robeson and Copacino (1994). 

Figure 17.1 and Figure 17.2 present two screens of a typical Advanced Planning
System (APS); the user would see these screens at different stages of
optimization. One screen represents the network prior to optimization and the
other represents the optimized network.



Data Collection 

A typical network configuration problem involves large amounts of data, includ- 
ing information on 

1. Locations of customers, retailers, existing warehouses and distribution cen- 
ters, manufacturing facilities, and suppliers. 

2. All products, including volumes, and special transport modes (e.g., refriger- 
ated). 

3. Annual demand for each product by customer location. 

4. Transportation rates by mode. 

5. Warehousing costs, including labor, inventory carrying charges, and fixed 
operating costs. 

6. Shipment sizes and frequencies for customer delivery. 

7. Order processing costs. 

8. Customer service requirements and goals. 




9. Production and sourcing costs and capacities.

FIGURE 17.1. The APS screen representing data prior to optimization

FIGURE 17.2. The APS screen representing the optimized logistics network



Data Aggregation 

A quick look at the above list suggests that the amount of data involved in any 
optimization model for this problem is overwhelming. For instance, a typical soft 
drink distribution system has between 10,000 and 120,000 accounts (customers). 
Similarly, in a retail logistics network, such as Wal-Mart or JC Penney, the number 
of different products that flow through the network is in the thousands or even 
hundreds of thousands. 

For that reason, an essential first step is data aggregation. This is carried out 
using the following procedure: 

1. Customers located in close proximity to each other are aggregated using a 
grid network or other clustering technique. All customers within a single cell 
or a single cluster are replaced by a single customer located at the center 
of the cell or cluster. This cell or cluster is referred to as a customer zone. 
A very effective technique that is commonly used is to aggregate customers 
according to the five-digit or three-digit zip code. Observe that if customers 
are classified according to their service levels or frequency of delivery, they 
will be aggregated together by classes. That is, all customers within the same 
class are aggregated independently of the other classes. 

2. Items are aggregated into a reasonable number of product groups, based on 

(a) Distribution pattern. All products picked up at the same source and 
destined for the same customers are aggregated together. Sometimes there 
is a need to aggregate not only by distribution pattern but also by 
logistics characteristics, such as weight and volume. That is, among all 
products having the same distribution pattern, we aggregate those SKUs 
with similar volume and weight into one product group. 

(b) Product type. In many cases, different products might simply be vari- 
ations in product models or style or might differ only in the type of 
packaging. These products are typically aggregated together. 

An important consideration, of course, is the impact on the model’s effectiveness 
of replacing the original detailed data with the aggregated data. We address this 
question in two ways. 

1. Even if the technology exists to solve the logistics network design problem 
with the original data, it may still be useful to aggregate data because our 
ability to forecast customer demand at the account and product levels is 
usually poor. Because of the reduction in variability achieved through ag- 
gregation, forecast demand is significantly more accurate at the aggregated 
level. 

2. Various researchers report that aggregating data into about 150 to 200 points 
usually results in no more than a 1 percent error in the estimation of total 
transportation costs; see Ballou (1992) and House and Jamie (1981). 

FIGURE 17.3. The APS screen representing data prior to aggregation 

In practice, the following approach is typically used when aggregating the data: 

• Aggregate demand points into 150 to 200 zones. If customers are classified 
into classes according to their service levels or frequency of delivery, each 
class will have 150 to 200 aggregated points. 

• Make sure each zone has approximately an equal amount of total demand. 
This implies that the zones may be of different geographic sizes. 

• Place the aggregated points at the center of the zone. 

• Aggregate the products into 20 to 50 product groups. 

Figure 17.3 presents information about 3,220 customers all located in North 
America while Figure 17.4 shows the same data after aggregation using a three- 
digit zip code resulting in 217 aggregated points. 
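
The zip-code aggregation described above can be sketched in a few lines. This is a minimal illustration only: the record layout (a zip code paired with an annual demand) and the sample data are hypothetical, and a real APS would also track the location of each zone center.

```python
from collections import defaultdict


def aggregate_by_zip3(customers):
    """Group customers into zones keyed by the first three zip-code digits.

    `customers` is a list of (zip_code, annual_demand) pairs; this layout
    is an assumption made for the sketch, not a format from the text.
    """
    zones = defaultdict(float)
    for zip_code, demand in customers:
        zones[zip_code[:3]] += demand
    return dict(zones)


# Hypothetical sample data: five customers in three three-digit zip areas.
customers = [
    ("60601", 120.0),
    ("60614", 80.0),
    ("02139", 50.0),
    ("02142", 30.0),
    ("10001", 70.0),
]
zones = aggregate_by_zip3(customers)
# The five customers collapse into three customer zones,
# and total demand is preserved by the aggregation.
```

If customers are split into service classes, the same function would simply be applied to each class separately.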

Transportation Rates 

The next step in constructing an effective distribution network design model is to 
estimate transportation costs. An important characteristic of most transportation 
rates, including truck, rail, and others, is that the rates are almost linear with 
distance but not with volume. We distinguish here between transportation costs 
associated with an internal and an external fleet. 

FIGURE 17.4. The APS screen representing data after aggregation 

Estimating transportation costs for company-owned trucks is typically quite 
simple. It involves annual costs per truck, annual mileage per truck, annual amount 
delivered, and the truck’s effective capacity. All this information can be used to 
easily calculate cost per mile per SKU. 

Incorporating transportation rates for an external fleet into the model is more 
complex. We distinguish here between two modes of transportation: truckload, 
referred to as TL, and less than truckload, referred to as LTL. 

In the United States, TL carriers subdivide the country into zones. Almost every 
state is a single zone, except for certain big states, such as Florida or New York, 
which are partitioned into two zones. The carriers then provide their clients with 
zone-to-zone cost tables. This database provides the cost per mile per truckload 
between any two zones. For example, to calculate TL cost from Chicago, Illinois, 
to Boston, Massachusetts, one needs to get the cost per mile for this pair and 
multiply it by the distance from Chicago to Boston. An important property of the 
TL cost structure is that it is not symmetric; that is, it is typically more expensive 
to ship a fully loaded truck from Illinois to New York than from New York to 
Illinois. 
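
The TL calculation above amounts to a table lookup followed by a multiplication. The sketch below assumes a hypothetical rate table keyed by ordered zone pairs; the rates, the use of state codes as zone names, and the 1,000-mile distance are illustrative values, not carrier data.

```python
# Hypothetical zone-to-zone rates in dollars per mile per truckload.
# The table is keyed by (origin, destination), so it need not be symmetric.
TL_RATE_PER_MILE = {
    ("IL", "MA"): 1.90,
    ("MA", "IL"): 1.40,
}


def tl_cost(origin_zone, dest_zone, distance_miles, rates=TL_RATE_PER_MILE):
    """Truckload cost: the rate for the ordered zone pair times the distance."""
    return rates[(origin_zone, dest_zone)] * distance_miles


# Illustrative distance of 1,000 miles between the two cities.
outbound = tl_cost("IL", "MA", 1000)
backhaul = tl_cost("MA", "IL", 1000)
```

Keying the table by ordered pairs is what lets the model capture the asymmetry of the TL cost structure.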

In the LTL industry, the rates typically belong to one of three basic types of 
freight rates: class, exception, and commodity. The class rates are standard rates 
that can be found for almost all products or commodities shipped. They are found 
with the help of a classification tariff that gives each shipment a rating or a class. 
For instance, the railroad classification includes 31 classes ranging from 400 to 
13 that are obtained from the widely used Uniform Freight Classification. The 
National Motor Freight Classification, on the other hand, includes only 23 classes 
ranging from 500 to 35. In all cases, the higher the rating or class, the greater the 
relative charge for transporting the commodity. There are many factors involved 
in determining a product’s specific class. These include product density, ease or 
difficulty of handling and transporting, and liability for damage. 

Once the rating is established, it is necessary to identify the rate basis number. 
This number is the approximate distance between the load’s origin and destination. 
With the commodity rating or class and the rate basis number, the specific rate 
per hundred pounds (hundredweight, or cwt) can be obtained from a carrier tariff 
table (i.e., a freight rate table). 
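
As a rough sketch of this lookup, the code below finds a rate per cwt from a class and a rate basis number and converts it into a shipment charge. The tariff values and distance brackets are hypothetical, and real tariffs typically also include weight breaks and minimum charges, which are omitted here.

```python
import bisect

# Hypothetical tariff. Each bracket covers rate basis numbers up to and
# including the listed upper bound; rates are in dollars per cwt.
RATE_BASIS_BREAKS = [200, 500, 1000]
TARIFF = {
    100: [18.0, 26.0, 35.0],
    150: [25.0, 36.0, 48.0],
}


def ltl_charge(freight_class, rate_basis_number, weight_lb):
    """Charge for one shipment: rate per cwt times weight in hundreds of pounds."""
    idx = bisect.bisect_left(RATE_BASIS_BREAKS, rate_basis_number)
    if idx == len(RATE_BASIS_BREAKS):
        raise ValueError("rate basis number is outside the tariff table")
    rate_per_cwt = TARIFF[freight_class][idx]
    return rate_per_cwt * weight_lb / 100.0


# A 4,000 lb shipment with rate basis number 450, for two classes.
charge_class_100 = ltl_charge(100, 450, 4000)
charge_class_150 = ltl_charge(150, 450, 4000)
```

Consistent with the text, the higher class yields the greater charge for the same weight and distance.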

The two other freight rates, namely exception and commodity, are specialized 
rates used to provide either less expensive rates (exception), or commodity-specific 
rates (commodity). For an excellent discussion, see Johnson and Wood (1986) and 
Patton (1994). Most carriers provide a database file with all of their transportation 
rates; these databases are typically incorporated into Advance Planning Systems. 

The proliferation of LTL carrier rates and the highly fragmented nature of 
the trucking industry have created the need for sophisticated rating engines. An 
example of such a rating engine that is widely used is SMC3’s RateWare (see 
www.smc3.com). This engine can work with various carrier tariff tables as well as 
SMC3’s CzarLite, one of the most widely used and accepted forms of nationwide 
LTL zip code-based rates. Unlike an individual carrier’s tariff, CzarLite offers 
a market-based price list derived from studies of LTL pricing on a regional, 
interregional, and national basis. This provides shippers with a fair pricing system 
and prevents any individual carrier’s operational and marketing bias from overtly 
influencing the shipper’s choice. Consequently, CzarLite rates are often used as 
a base for negotiating LTL contracts between shippers, carriers, and third-party 
logistics providers. 

In Figure 17.5 we provide LTL cost charged by one carrier for shipping 4,000 
pounds as a function of the distance from Chicago. The cost is given for two 
classes, class 100 and class 150. As you can see, in this case, the transportation 
cost function is not linear with distance. 

Warehouse Costs 

Warehousing and distribution center costs include three main components: 

1. Handling costs. These include labor and utility costs that are proportional 
to annual flow through the warehouse. 

2. Fixed costs. These capture all cost components that are not proportional to 
the amount of material that flows through the warehouse. The fixed cost is 
typically proportional to warehouse size (capacity); the cost is a step-wise 
function of the warehouse capacity. That is, this cost is fixed in certain ranges 
of the warehouse size. 

3. Storage costs. These represent inventory holding costs, which are 
proportional to average positive inventory levels. 

FIGURE 17.5. Transportation Rates for Shipping 4000 lb. 

Thus, estimating the warehouse handling costs is fairly easy while estimating the 
other two cost values is quite difficult. To see this difference, suppose that during 
the entire year 1,000 units of product are required by a particular customer. These 
1,000 units are not required to flow through the warehouse at the same time, 
so the average inventory level will likely be significantly lower than 1,000 units. 
Thus, when constructing the data for the APS, we need to convert these annual 
flows into actual inventory amounts over time. Similarly, annual flow and average 
inventory associated with this product tell us nothing about how much space is 
needed for the product in the warehouse. This is true because the amount of space 
that the warehouse needs is proportional to peak inventory, not annual flow or 
average inventory. 

An effective way to overcome this difficulty is to utilize the inventory turnover 
ratio. This is defined as annual sales divided by the average inventory level. 
Specifically, in our case the inventory turnover ratio is the ratio of the total 
annual outflow from the warehouse to the average inventory level. Thus, the average 
inventory level is the total annual flow divided by the inventory turnover ratio. 
Multiplying the average inventory level by the inventory holding cost gives the 
annual storage costs. Finally, to calculate the fixed cost, we need to estimate the 
warehouse capacity. This is done in the next subsection. 
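
The storage cost calculation can be written out directly. In this sketch the function names and the holding cost of $2.50 per unit per year are hypothetical; only the two relationships (flow divided by turnover, then multiplied by the holding cost) come from the text.

```python
def average_inventory(annual_flow, turnover_ratio):
    """Average inventory level implied by the inventory turnover ratio."""
    return annual_flow / turnover_ratio


def annual_storage_cost(annual_flow, turnover_ratio, holding_cost_per_unit):
    """Annual storage cost: average inventory times the per-unit holding cost."""
    return average_inventory(annual_flow, turnover_ratio) * holding_cost_per_unit


# Hypothetical inputs: 1,000 units per year, a turnover ratio of 10, and a
# holding cost of $2.50 per unit of average inventory per year.
storage_cost = annual_storage_cost(1000, 10.0, 2.50)
```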

Warehouse Capacities 

Another important input to the distribution network design model is the actual 
warehouse capacity. It is not immediately obvious, however, how to estimate the 
actual space required, given the specific annual flow of material through the ware- 
house. Again, the inventory turnover ratio suggests an appropriate approach. As 
before, annual flow through a warehouse divided by the inventory turnover ratio 
allows us to calculate the average inventory level. Assuming a regular shipment 
and delivery schedule, such as that given in Figure 6.1, it follows that the required 
storage space is approximately twice that amount. In practice, of course, every 
pallet stored in the warehouse requires an empty space to allow for access and 
handling; thus, considering this space as well as space for aisles, picking, sorting, 
and processing facilities, and AGVs (automatic guided vehicles), we typically mul- 
tiply the required storage space by a factor (> 1). This factor depends on the 
specific application and allows us to assess the amount of space available in the 
warehouse more accurately. A typical factor used in practice is three. This factor 
would be used in the following way. Consider a situation where the annual flow 
through the warehouse is 1,000 units and the inventory turnover ratio is 10.0. This 
implies that the average inventory level is about 100 units. If each unit takes 
10 square feet of floor space, the average inventory occupies 1,000 square feet, 
and the required space for the products is twice that, or 2,000 square feet. 
Multiplying by the factor of three, the total space required for the warehouse 
is about 6,000 square feet. 
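
The same arithmetic can be packaged as a small function. This is a sketch under the assumptions stated in the text (storage space equal to twice the average inventory, then scaled by a space factor); the function and parameter names are our own.

```python
def required_warehouse_space(annual_flow, turnover_ratio, sqft_per_unit,
                             space_factor=3.0):
    """Estimate total warehouse space in square feet.

    Average inventory is annual flow divided by the turnover ratio; storage
    space is taken as twice the average inventory (regular shipment and
    delivery schedule), and the space factor (> 1) covers aisles, picking,
    sorting, and handling areas.
    """
    avg_inventory = annual_flow / turnover_ratio
    storage_space = 2 * avg_inventory * sqft_per_unit
    return space_factor * storage_space


# The example from the text: 1,000 units per year, turnover ratio 10,
# 10 square feet per unit, and a factor of three.
total_space = required_warehouse_space(1000, 10.0, 10)
```

For these inputs the function returns 6,000 square feet, matching the figure above.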

Potential Warehouse Locations 

It is also important to effectively identify potential locations for new warehouses. 
Typically, these locations must satisfy a variety of conditions: 

• Geographical and infrastructure conditions. 

• Natural resources and labor availability. 

• Local industry and tax regulations. 

• Public interest. 

As a result, only a limited number of locations would meet all the requirements. 
These are the potential location sites for the new facilities. 

Service Level Requirements 

There are various ways to define service levels in this context. For example, 
we might specify a maximum distance between each customer and the warehouse 
serving it. This ensures that a warehouse will be able to serve its customers within 
a reasonable time. Sometimes we must recognize that for some customers, such as 
those in rural or isolated areas, it is harder to provide the same level of service that 
most other customers receive. In this case, it is often helpful to define the service 
level as the proportion of customers whose distance to their assigned warehouse is 
no more than a given distance. For instance, we might require that 95 percent of 
the customers be situated within 200 miles of the warehouses serving them. 
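
A proportional service level of this kind is easy to check for a given assignment of customers to warehouses. The sketch below assumes the distances have already been computed; the sample distances are hypothetical.

```python
def service_level(distances_miles, max_distance):
    """Fraction of customers within `max_distance` of their assigned warehouse."""
    within = sum(1 for d in distances_miles if d <= max_distance)
    return within / len(distances_miles)


# Hypothetical distances (in miles) from ten customers to their warehouses.
distances = [50, 120, 180, 199, 200, 240, 90, 30, 150, 310]
level = service_level(distances, max_distance=200)
meets_requirement = level >= 0.95
```

Here eight of the ten customers are within 200 miles, so this assignment would not satisfy a 95 percent requirement.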

Future Demand 

As observed in Chapter 1, decisions at the strategic level, which include distribu- 
tion network design, have a long-lasting effect on the firm. In particular, decisions 
regarding the number, location, and size of warehouses have an impact on the firm 
for at least the next three to five years. This implies that changes in customer 
demand over the next few years should be taken into account when designing the 
network. This is most commonly addressed using a scenario-based approach in- 
corporating net present value calculations. For example, various possible scenarios 
representing a variety of possible future demand patterns over the planning hori- 
zon can be generated. These scenarios can then be directly incorporated into the 
model to determine the best distribution strategy. 
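
One simple way to combine scenarios with net present value is sketched below. Everything specific here is an assumption made for illustration: the two candidate designs, the annual costs, the 10 percent discount rate, and the choice to weight the scenarios equally. An actual model would embed the scenarios in the optimization itself rather than compare fixed designs.

```python
def npv(cash_flows, rate):
    """Present value of end-of-year cash flows for years 1, 2, ..."""
    return sum(cf / (1.0 + rate) ** year
               for year, cf in enumerate(cash_flows, start=1))


# Hypothetical annual network costs (in $ millions) of two candidate designs
# under two demand scenarios over a three-year planning horizon.
costs = {
    "two_warehouses": {
        "low_demand": [10.0, 10.0, 10.0],
        "high_demand": [12.0, 14.0, 16.0],
    },
    "three_warehouses": {
        "low_demand": [11.0, 11.0, 11.0],
        "high_demand": [11.5, 12.5, 13.5],
    },
}


def average_npv_cost(design, rate=0.10):
    """Discounted cost of a design, averaged over equally weighted scenarios."""
    per_scenario = [npv(flows, rate) for flows in costs[design].values()]
    return sum(per_scenario) / len(per_scenario)


best_design = min(costs, key=average_npv_cost)
```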

Model and Data Validation 

The previous subsections document the difficulties in collecting, tabulating, and 
cleaning the data for a network configuration model. Once this is done, how do we 
ensure that the data and model accurately reflect the network design problem? 

The process used to address this issue is known as model and data validation. 
This is typically done by reconstructing the existing network configuration using 
the model and collected data, and comparing the output of the model to existing 
data. 

The importance of validation cannot be overstated. Valuable output of the 
model configured to duplicate current operating conditions includes all costs 
(warehousing, inventory, production, and transportation) generated under the 
current network configuration. These data can be compared to the company’s 
accounting information. This is often the best way to identify errors in the data, 
problematic assumptions, modeling flaws, and so forth. 

In one project we are aware of, for example, the transportation costs calculated 
during the validation process were consistently underestimating the costs suggested 
by the accounting data. After a careful review of the distribution practices, the 
consultants concluded that the effective truck capacity was only about 30 percent 
of the truck’s physical capacity; that is, trucks were being sent out with very little 
load. Thus, the validation process not only helped calibrate some of the parameters 
used in the model but also suggested potential improvements in the utilization of 
the existing network. 
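
A comparison of this kind can be automated by flagging cost categories where the model and the accounting data disagree by more than some tolerance. The figures and the 5 percent tolerance below are hypothetical; they are chosen so that transportation is underestimated, as in the project described above.

```python
def validation_gaps(model_costs, accounting_costs, tolerance=0.05):
    """Return the relative error of each category that exceeds the tolerance.

    A negative value means the model underestimates the accounting figure.
    """
    gaps = {}
    for category, actual in accounting_costs.items():
        relative_error = (model_costs[category] - actual) / actual
        if abs(relative_error) > tolerance:
            gaps[category] = relative_error
    return gaps


# Hypothetical annual costs in $ millions.
model_costs = {"warehousing": 4.1, "inventory": 2.0,
               "production": 20.3, "transportation": 6.0}
accounting_costs = {"warehousing": 4.0, "inventory": 2.05,
                    "production": 20.0, "transportation": 9.0}

gaps = validation_gaps(model_costs, accounting_costs)
```

Only the transportation category is flagged, which points the analyst toward parameters such as the effective truck capacity.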

It is often also helpful to make local or small changes in the network 
configuration to see how the system estimates their impact on costs and service levels. 
Specifically, this step involves positing a variety of what-if questions. This includes 
estimating the impact of closing an existing warehouse on system performance. Or, 
to give another example, it allows the user to change the flow of material through 
the existing network and see the changes in the costs. Often, managers have good 
intuition about what the effect of these small-scale changes on the system should 
be, so they can more easily identify errors in the model. Intuition about the effect 
of radical redesign of the entire system is often much less reliable. To summarize, 
the model validation process typically involves answering the following questions: 

• Does the model make sense? 

• Are the data consistent? 

• Can the model results be fully explained? 

• Did you perform sensitivity analysis? 

Validation is critical for determining the validity of the model and data, but 
the process has other benefits. In particular, it helps the user make the connection 
between the current operations, which were modeled during the validation process, 
and possible improvements after optimization. 

Key Features of a Network Configuration APS 

One of the key requirements of any Advance Planning System for network design 
is flexibility. In this context, we define flexibility as the ability of the system to 
incorporate a large set of preexisting network characteristics. Indeed, depending 
on the particular application, a whole spectrum of design options may be appro- 
priate. At one end of this spectrum is the complete re-optimization of the existing 
network. This means that each warehouse can be either opened or closed and all 
transportation flows can be redirected. At the other end of the spectrum, it may 
be necessary to incorporate the following features in the optimization model: 

1. Customer-specific service level requirements. 

2. Existing warehouses. In most cases, warehouses already exist and their leases 
have not yet expired. Therefore, the model should not permit the closing of 
these warehouses. 

3. Expansion of existing warehouses. Existing warehouses may be expandable. 

4. Specific flow patterns. In a variety of situations, specific flow patterns (e.g., 
from a particular warehouse to a set of customers) should not be changed, 
or perhaps more likely, a certain manufacturing location does not or cannot 
produce certain SKUs. 

5. Warehouse-to-warehouse flow. In some cases, material may flow from one 
warehouse to another warehouse. 

6. Production and bill of materials. In some cases, assembly is required and 
needs to be captured by the model. For this purpose, the user needs to 
provide information on the components used to assemble finished goods. In 
addition, production information down to the line level can be included in 
the model. 

It is not enough for the Advance Planning System to incorporate all of the 
features described above. It also must have the capability to deal with all these 
issues with little or no reduction in its effectiveness. The latter requirement is 
directly related to the so-called robustness of the system. This stipulates that the 
relative quality of the solution generated by the system (i.e., cost and service level) 
should be independent of the specific environment, the variability of the data, or 
the particular setting. If a particular Advance Planning System is not robust, it is 
difficult to determine how effective it will be for a particular problem. 



17.3 Strategic Safety Stock 

An important question when designing the logistics network and when managing 
inventory in a complex supply chain is where to keep safety stock and, similarly, 
which facilities should produce to stock and which should produce to order. The 
answer to this question clearly depends on the desired service level, the supply 
network, and lead times, as well as a variety of operational issues and constraints. 
Thus, our focus is on a strategic model that allows the firm to position safety 
stock effectively in its supply chain. 

To illustrate the trade-offs and the impact of strategically positioning safety 
stock in the supply chain, consider the following example. 

ElecComp Inc. is a large contract manufacturer of circuit boards and other 
high-tech parts. The company sells about 27,000 high-value products whose life cycle 
is relatively short. Competition in this industry forces ElecComp to commit to 
short lead times to its customers; this committed service time to the customers is 
typically much shorter than the manufacturing lead time. Unfortunately, the 
manufacturing process is quite complex, involving a sequence of assemblies at 
different stages. 

Because of the long manufacturing lead time and the pressure to provide cus- 
tomers with a short response time, ElecComp kept inventory of finished products 
for many of its SKUs. Thus, the company managed its supply chain based on 
long-term forecasts, the so-called Push-based supply chain strategy. This 
Make-to-Stock environment required the company to build safety stock and resulted in 
huge financial and shortage risks. 

Executives at ElecComp had long recognized that this Push-based supply chain 
strategy was not the appropriate strategy for their supply chain. Unfortunately, 
because of the long lead time, a Pull-based supply chain strategy, in which 
manufacturing and assembly are done based on realized demand, was not appropriate 
either. 

Thus, ElecComp focused on developing a new supply chain strategy whose 
objectives were: 

1. Reducing inventory and financial risks. 

2. Providing customers with competitive response times. 

This could be achieved by 

• Determining the optimal location of inventory across the various stages of 
the manufacturing and assembly process. 

• Calculating the optimal quantity of safety stock for each component at each 
stage. 

Thus, the focus of redesigning ElecComp’s supply chain was on a hybrid strategy 
in which a portion of the supply chain is managed based on Push, i.e., a 
Make-to-Stock environment, while the remaining portion of the supply chain is managed 
based on Pull, i.e., a Make-to-Order strategy. Evidently, the supply chain stages 
that produce to stock will be the locations where the company keeps safety stock, 
while the Make-to-Order stages will keep no stock at all. Hence, the challenge was 
to identify the location in the supply chain at which the strategy switches from 
a Push-based, i.e., a Make-to-Stock strategy, to a Pull-based, i.e., a Make-to-Order 
supply chain strategy. This location is referred to as the Push-Pull boundary. 

ElecComp developed and implemented the new Push-Pull supply chain strategy, 
and the impact was dramatic! For the same customer lead times, safety stock 
was reduced by 40 to 60 percent, depending on product line. More importantly, 
with the new supply chain structure, ElecComp concluded that they could cut 
lead times to their customers by 50 percent and still enjoy a 30 percent reduction 
in safety stock. 

Below we describe how this was achieved for a number of product lines. 

An Illustrative Example 

To understand the analysis and the benefit experienced by ElecComp, consider 
Figure 17.6 in which a finished product (part 1) is assembled in a Dallas facility 
from two components, one produced in the Montgomery facility and one in a 
different facility in Dallas. Each box provides information about the value of the 
product produced by that facility; numbers under each box are the processing 
time at that stage; bins represent safety stock. Transit times between facilities are 
provided as well. Finally, each facility provides a committed response time to the 
downstream facilities. For instance, the assembly facility quotes a 30-day response 
time to its customers. This implies that any order can be satisfied in no more 
than 30 days. The Montgomery facility quotes an 88-day response time to the 
assembly facility. As a result, the assembly facility needs to keep inventory of 
finished products in order to satisfy customer orders within its 30-day committed 
service time. 

Observe that if somehow ElecComp can reduce the committed service time from 
the Montgomery facility to the assembly facility from 88 days to, say, 50 or perhaps 
40 days, the assembly facility will be able to reduce its finished goods inventory 
while the Montgomery facility will need to start building inventory. Of course, 
ElecComp’s objective is to minimize systemwide inventory and manufacturing 
costs; this is precisely what Inventory Analyst from LogicTools allows users to 
do. By looking at the entire supply chain, the tool determines the appropriate 
inventory level at each stage. 

For instance, if the Montgomery facility reduces its committed lead time to 13 
days, then the assembly facility does not need any inventory of finished goods. 
Any customer order will trigger an order for parts 2 and 3. Part 2 will be available 
immediately, since the facility producing part 2 holds inventory, while part 3 will 
be available at the assembly facility in 15 days: 13 days of committed response time 
by the manufacturing facility plus 2 days of transportation lead time. It takes another 
15 days to process the order at the assembly facility, and therefore the order will 
be delivered to the customers within the committed service time of 30 days. Thus, in 
this case, the assembly facility produces to order, i.e., follows a Pull-based strategy, 
while the Montgomery facility needs to keep inventory and hence is managed based on 
Push, that is, a Make-to-Stock strategy. 

FIGURE 17.6. How to read the diagrams 
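
The lead-time reasoning in this example can be checked with a short calculation. The sketch assumes a simple rule that is consistent with the numbers in the text but is our own simplification: a stage that holds no stock must wait for its slowest input (the supplier's committed service time plus transit time) and then process the order. Part 2 is modeled as available immediately, with zero service and transit time.

```python
def order_response_time(inputs, processing_days):
    """Days a make-to-order stage needs to fill an order.

    `inputs` is a list of (committed_service_days, transit_days) pairs, one
    per component; the stage waits for the slowest component, then processes.
    """
    wait = max(service + transit for service, transit in inputs)
    return wait + processing_days


QUOTED_DAYS = 30       # response time the assembly facility quotes to customers
PROCESSING_DAYS = 15   # processing time at the assembly facility

# Part 2 is held in stock upstream (available immediately); part 3 comes from
# Montgomery with a 2-day transit time.
with_13_day_commitment = order_response_time([(0, 0), (13, 2)], PROCESSING_DAYS)
with_88_day_commitment = order_response_time([(0, 0), (88, 2)], PROCESSING_DAYS)

can_make_to_order = with_13_day_commitment <= QUOTED_DAYS
must_hold_finished_goods = with_88_day_commitment > QUOTED_DAYS
```

With a 13-day commitment the assembly facility can just meet its 30-day quote without finished goods inventory; with an 88-day commitment it cannot.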

Now that the trade-offs are clear, consider the product structure depicted in 
Figure 17.7. Light boxes (parts 4, 5, and 7) represent outside suppliers whereas 
dark boxes represent internal stages within ElecComp’s supply chain. Observe 
that the assembly facility commits to a 30-day response time to the customers and 
keeps inventory of finished goods. More precisely, the assembly facility and the 
facility manufacturing part 2 both produce to stock. All other stages produce to 
order. 

Figure 17.8 depicts the optimized supply chain that provides customers with 
the same 30-day response time. Observe that by adjusting the committed service time 
of various internal facilities, the assembly system starts producing to order and 
keeps no finished goods inventory. On the other hand, the Raleigh and Montgomery 
facilities need to reduce their committed service time and hence keep inventory. 

So where is the Push and where is the Pull in the optimized strategy? Evidently, 
the assembly facility and the Dallas facility that produces part 2 both operate now 
in a Make-to-Order fashion, i.e., a Pull strategy, while the Montgomery facility 
operates in a Make-to-Stock fashion, a Push-based strategy. The impact on the 
supply chain is a 39 percent reduction in safety stock! 

FIGURE 17.7. Current Safety Stock Locations 

FIGURE 17.8. Optimized Safety Stock Locations 

At this point it was appropriate to analyze the impact of a more aggressive 
quoted lead time to the customers. That is, ElecComp executives considered 
reducing quoted lead times to the customers from 30 days to 15 days. Figure 17.9 
depicts the optimized supply chain strategy in this case. The impact was clear. 
Relative to the baseline (Figure 17.7), inventory was down by 28 percent while 
response time to the customers was halved. 

FIGURE 17.9. Optimized Safety Stock with Reduced Lead Time 

Finally, Figures 17.10 and 17.11 present a more complex product structure. Fig- 
ure 17.10 provides information about the supply chain strategy before optimization 
and Figure 17.11 depicts the supply chain strategy after optimizing the Push-Pull 
boundary as well as inventory levels at different stages in the supply chain. Again, 
the benefit is clear. By correctly selecting which stage is going to produce to or- 
der and which is producing to stock, inventory cost was reduced by more than 60 
percent while maintaining the same quoted lead time to the customers. 

Summary 

Using a multi-stage inventory optimization technology (Inventory Analyst from 
LogicTools), ElecComp was able to significantly reduce inventory cost while 
maintaining, and sometimes significantly decreasing, quoted service times to the 
customers. This was achieved by 

1. Identifying the Push-Pull boundary; that is, identifying supply chain stages 
that should operate in a Make-to-Stock fashion and hence keep safety stock. 
The remaining supply chain stages operate in a Make-to-Order fashion and 
thus keep no inventory. This is done by pushing inventory to less costly 
locations in the supply chain. 

2. Taking advantage of the risk pooling concept. This concept suggests that 
demand for a component used by a number of finished products has smaller 
variability and uncertainty than that of the finished goods. 

3. Replacing traditional supply chain strategies that are typically referred to 
as sequential, or local, optimization by a globally optimized supply chain 
strategy. In a sequential, or local, optimization strategy, each stage tries to 
optimize its profit with very little regard for the impact of its decisions on 
other stages in the same supply chain. On the other hand, in a global supply 
chain strategy, one considers the entire supply chain and identifies strategies 
for each stage that will maximize supply chain performance. 

FIGURE 17.12. Global vs. local optimization (safety stock cost vs. quoted lead time) 
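
The risk pooling effect in item 2 can be illustrated numerically. The sketch below assumes that the demands for the finished products are statistically independent, which is our simplifying assumption (the text does not state one), and uses hypothetical means and standard deviations. Variability is measured by the coefficient of variation, the standard deviation divided by the mean.

```python
import math


def pooled_cv(means, stds):
    """Coefficient of variation of total demand, assuming independent demands.

    For independent demands the variances add, so the standard deviation of
    the total is the square root of the sum of squared standard deviations.
    """
    total_std = math.sqrt(sum(s * s for s in stds))
    return total_std / sum(means)


# Hypothetical data: a component used in four finished products, each with
# mean demand 100 and standard deviation 40 per period.
means = [100.0, 100.0, 100.0, 100.0]
stds = [40.0, 40.0, 40.0, 40.0]

product_cv = stds[0] / means[0]        # variability of one finished product
component_cv = pooled_cv(means, stds)  # variability of the shared component
```

Under these assumptions the component's coefficient of variation is half that of each finished product, which is why holding safety stock at the component level can be cheaper.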

To better understand the impact of the new supply chain paradigm employed by 
ElecComp, consider Figure 17.12, where we plot total inventory cost against quoted 
lead time to the customers. The upper trade-off curve represents the traditional 
relationship between cost and quoted lead time to the customers. This curve is a 
result of locally optimizing decisions at each stage in the supply chain. The lower 
trade-off curve is the one obtained when the firm globally optimizes the supply 
chain by correctly locating the Push-Pull boundary. 

Observe that this shift of the trade-off curve, due to optimally locating the 
Push-Pull boundary, implies: 

1. For the same quoted lead time, the company can significantly reduce cost, or 

2. For the same cost, the firm can significantly reduce lead time. 

Finally, notice that the curve representing the traditional relationship between cost 
and customer quoted lead time is smooth, while the new trade-off curve representing 
the impact of optimally locating the Push-Pull boundary is not, with jumps in 
various places. These jumps represent situations in which the location of the 
Push-Pull boundary changes and significant cost savings are achieved. 

Our experience is that those employing the new supply chain paradigm, like 
ElecComp, typically choose a supply chain strategy that reduces both cost and 
customer quoted lead time. This strategy allows ElecComp to satisfy demand much 
faster than its competitors and develop a cost structure that enables competitive 
pricing. 



17.4 Resource Allocation 

Supply chain master planning is defined as the process of coordinating and 
allocating production and distribution strategies and resources to maximize profit or 
minimize systemwide cost. In this process, the firm considers forecast demand for 
the entire planning horizon, e.g., the next fifty-two weeks, as well as safety stock 
requirements. The latter are determined, for instance, based on models similar to 
the one analyzed in the previous section. 

The challenge of allocating production, transportation and inventory resources 
in order to satisfy demand can be daunting. This is especially true when the firm 
is faced with seasonal demand, limited capacities, competitive promotions or high 
volatility in forecasting. Indeed, decisions such as when and how much to produce, 
where to store inventory, and whether to lease additional warehouse space may 
have enormous impact on supply chain performance. 

Traditionally, the supply chain planning process was performed manually with 
a spreadsheet and was done by each function in the company independently of 
other functions. That is, the production plan would be determined at the plant, 
independently from the inventory plan, and would typically require the two plans 
to be somehow coordinated at a later time. This implies that divisions typically 
end up “optimizing” just one parameter, usually production costs. 

In modern supply chains, however, this sequential process is replaced by a
process that takes into account the interaction between the various levels of the
supply chain and identifies a strategy that maximizes supply chain performance.
This is referred to as global optimization, and it requires an optimization-based
Advanced Planning System (APS). These systems, which model the supply chain as
large-scale mixed integer linear programs, are analytical tools capable of
considering the complexity and dynamic nature of the supply chain.
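
To make the phrase "mixed integer linear program" concrete, the following is a deliberately small sketch and not the model used by any particular planning system: a single-product, single-plant lot-sizing model over $T$ periods. All symbols are assumptions introduced here for illustration: $d_t$ is forecast demand, $C_t$ is production capacity, $c_t$ is unit production cost, $K_t$ is set-up cost, $h_t$ is unit holding cost, $x_t$ is the production quantity, $I_t$ is end-of-period inventory, and $y_t$ is a binary set-up indicator.

```latex
\begin{align*}
\min \quad & \sum_{t=1}^{T} \left( c_t x_t + K_t y_t + h_t I_t \right) \\
\text{s.t.} \quad & I_{t-1} + x_t - d_t = I_t, && t = 1, \dots, T, \\
& x_t \le C_t \, y_t, && t = 1, \dots, T, \\
& I_0 = 0, \quad x_t \ge 0, \quad I_t \ge 0, \quad y_t \in \{0, 1\}, && t = 1, \dots, T.
\end{align*}
```

A realistic master planning model would add indices for products, plants, warehouses and transportation lanes, and would typically replace $I_t \ge 0$ by a safety stock requirement. Even this small version shows where the integer variables come from: the set-up decisions $y_t$.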

Typically, the output from the tool is an effective supply chain strategy that
coordinates production, warehousing, transportation, and inventory decisions. The
resulting plan provides information on production quantities, shipment sizes and
storage requirements by product, location and time period. This is referred to as
the supply chain master plan.

FIGURE 17.13. The extended Supply Chain: From manufacturing to order fulfillment

In some applications, the supply chain master plan serves as an input for a 
detailed production scheduling system. In this case, the production scheduling 
system employs information about production quantities and due dates received 
from the supply chain master plan. This information is used to propose a detailed 
manufacturing sequence and schedule. This allows the planner to integrate the 
back-end of the supply chain, i.e., manufacturing and production, and the
front-end of the supply chain, i.e., demand planning and order replenishment (see
Figure 17.13). This diagram illustrates an important issue. The focus of order
replenishment systems, which are part of the pull portion of the supply chain, is on service
level. Similarly, the focus of the tactical planning, i.e., the process by which the 
firm generates a supply chain master plan, which is in the push portion of the 
supply chain, is on cost minimization or profit maximization. Finally, the focus in 
the detailed manufacturing scheduling portion of the supply chain is on feasibility. 
That is, the focus is on generating a detailed production schedule that satisfies all 
production constraints and meets all the due date requirements generated by the 
supply chain master plan. 

Of course, the output from the tactical planning process, i.e., the supply chain 
master plan, is shared with supply chain participants to improve coordination and 
collaboration. For example, the distribution center managers can now better use 
this information to plan their labor and shipping needs. Distributors can share 
plans with their suppliers and customers in order to decrease costs for all partners 
in the supply chain and promote savings. Specifically, distributors can realign
territories to better serve customers, store adequate amounts of inventory at the
customer site and coordinate overtime production with suppliers. 

In addition, supply chain master planning tools can identify potential supply 
chain bottlenecks early in the planning process, allowing the planner to answer 
questions such as: 

• Will leased warehouse space alleviate capacity problems? 

• When and where should the inventory for seasonal or promotional demand 
be built and stored? 

• Can capacity problems be alleviated by re-arranging warehouse territories? 

• What impact do changes in the forecast have on the supply chain? 

• What will be the impact of running overtime at the plants or out-sourcing 
production? 

• What plant should replenish each warehouse? 

• Should the firm ship by sea or by air? Shipping by sea implies long lead
times and therefore requires high inventory levels. On the other hand, using 
air carriers reduces lead times and hence inventory levels but significantly 
increases transportation cost. 

• Should we rebalance inventory between warehouses or replenish from the 
plants to meet unexpected regional changes in demand? 

Another important capability that tactical planning tools have is the ability to 
analyze demand plans and resource utilization to maximize profit. This enables 
balancing the effect of promotions, new product introductions and other planned 
changes in demand patterns and supply chain costs. Planners now are able to 
analyze the impact of various pricing strategies as well as identify markets, stores 
or customers that do not provide the desired profit margins. 

A natural question is when should one focus on cost minimization and when on 
profit maximization? While the answer to this question may vary from instance to 
instance, it is clear that cost minimization is important when the structure of the 
supply chain is fixed or at times of a recession and therefore oversupply. In this 
case the focus is on satisfying all demand at the lowest cost by allocating resources 
effectively. On the other hand, profit maximization is important at times of growth,
i.e., at times when demand exceeds supply. In this case, capacity can be limited
because of the use of limited natural resources or because of expensive manufacturing
processes that are hard to expand, as is the case in the chemical and electronics
industries. In these cases, deciding whom to serve and at what price is more critical
than cost savings.

Finally, an effective supply chain master planning tool must also be able to help 
the planners improve the accuracy of the supply chain model. This, of course, is 
counter-intuitive since the accuracy of the supply chain master planning model
depends on the accuracy of the demand forecast that is an input to the model.
However, notice that the accuracy of the demand forecast is typically time
dependent. That is, the accuracy of forecast demand for the first few time periods,
e.g., the first ten weeks, is much higher than the accuracy of demand forecast for
later time periods. This suggests that the planner should model the early portion of
the demand forecast at a great level of detail, i.e., apply weekly demand information.
On the other hand, demand forecasts for later time periods are not as accurate and 
hence the planner should model the later demand forecast month by month or by 
groups of 2-3 weeks each. This implies that later demand forecasts are aggregated 
into longer time buckets and hence, due to the risk pooling concept, the accuracy 
of the forecast improves. 
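
The risk pooling effect of longer time buckets can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that weekly demands are independent and identically distributed; the mean of 100 and standard deviation of 40 are hypothetical numbers. Under that assumption the coefficient of variation of the bucket total falls with the square root of the bucket length.

```python
import math


def coefficient_of_variation(mean_weekly, std_weekly, weeks_per_bucket):
    """CV of total demand in a bucket of independent, identically distributed weeks."""
    bucket_mean = weeks_per_bucket * mean_weekly
    # Variances of independent weeks add, so the standard deviation grows with sqrt(k).
    bucket_std = math.sqrt(weeks_per_bucket) * std_weekly
    return bucket_std / bucket_mean


# Hypothetical weekly demand: mean 100 units, standard deviation 40 units.
weekly_cv = coefficient_of_variation(100.0, 40.0, 1)
four_week_cv = coefficient_of_variation(100.0, 40.0, 4)
print(f"weekly CV = {weekly_cv:.2f}")      # 0.40
print(f"4-week CV = {four_week_cv:.2f}")   # 0.20
```

If weekly demands are positively correlated, the reduction is smaller than this calculation suggests.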

In summary, supply chain master planning helps address fundamental trade- 
offs in the supply chain such as set up cost versus holding costs or production lot 
sizes versus capacities. It takes into account supply chain costs such as production, 
supply, warehousing, transportation, taxes and inventory as well as capacities and 
changes in the parameters over time. 

This example illustrates how supply chain master planning can be used dynami- 
cally and consistently to help a large food manufacturer manage the supply chain. 
The food manufacturer makes production and distribution decisions at the divi- 
sion level. Even at the division level, the problems tend to be large-scale. Indeed, 
a typical division may include hundreds of products, multiple plants, many pro- 
duction lines within a plant, multiple warehouses (including overflow facilities), 
bill-of-material structures to account for different packaging options, and a 52- 
week demand forecast for each product for each region. The forecast accounts for 
seasonality and planned promotions. The annual forecast is important because a 
promotion late in the year may require production resources relatively early in the 
year. Production and warehousing capacities are tight and products have limited
shelf life that needs to be integrated into the analysis. Finally, the scope of the plan
spans many functional areas, including purchasing, production, transportation, 
distribution, and inventory management. Traditionally, the supply chain planning 
process was performed manually with a spreadsheet and was done by each function 
in the company. That is, the production plan would be done at the plant, indepen- 
dently from the inventory plan, and would typically require the two plans to be 
somehow coordinated at a later time. This implies that divisions typically end up 
“optimizing” just one parameter, usually production costs. The tactical planning 
APS introduced in the company allows the planners to reduce systemwide cost and 
better utilize resources such as manufacturing and warehousing. Indeed, a detailed 
comparison of the plan generated by the tactical tool with the spreadsheet strategy 
suggests that the optimization-based tool is capable of reducing total costs across 
the entire supply chain. See Figure 17.14 for illustrative results.

FIGURE 17.14. Comparison of manual versus optimized scenarios



17.5 Summary 

Optimizing supply chain performance is difficult because of conflicting objectives,
demand and supply uncertainties, and supply chain dynamics.
However, through network planning, which combines network design, inventory 
positioning and resource allocation, the firm can globally optimize supply chain 
performance. This is achieved by considering the entire network, taking into ac- 
count production, warehousing, transportation and inventory costs as well as ser- 
vice level requirements. 

Table 17.1 summarizes the key dimensions of each of the planning activities:
network design, inventory positioning/management and supply chain master
planning. The table shows that network design involves long-term plans, typically
over years, is done at a high level and can yield high returns. The planning horizon
for supply chain master planning is months or weeks, the frequency of replanning
is high, e.g., every week, and it typically delivers quick results as well. Inventory
planning is focused on short-term uncertainty in demand, lead time, processing
time or supply. The frequency of replanning is high, e.g., monthly planning to
determine appropriate safety stock based on the latest forecast and forecast error.
Inventory planning can also be used more strategically to identify locations in the 
supply chain where the firm keeps inventory, as well as identify stages that produce 
to stock and those that produce to order.

                      Network          Inventory        Resource
                      Design           Positioning      Allocation

Decision focus        Infrastructure   Safety stock     Production,
                                                        distribution
Planning horizon      Years            Months           Months
Aggregation level     Family           Item             Classes
Frequency             Yearly           Monthly/Weekly   Monthly/Weekly
Return on investment  High             Medium           Medium
Implementation        Very short       Short            Short
User                  Very few         Few              Few

TABLE 17.1. Network Planning Characteristics

17.6 Exercises 



Exercise 17.1. Consider $n$ independent and identically distributed random
variables, $X_1, X_2, \ldots, X_n$. Let $S_n = \frac{1}{n} \sum_{i=1}^{n} X_i$. Find the variance of
the random variable $S_n$ as a function of the variance of $X_1$.



18.1 Introduction

We now turn our attention to a case study in transportation logistics. We highlight 
particular issues that arise when implementing an optimization algorithm in a real- 
life routing situation. The case concerns the routing and scheduling of school buses 
in the five boroughs of New York City. 

Many of the vehicle routing problems we have discussed so far (see Part II) 
have been simplified versions of the usually more complex problems that appear 
in practice. Typically, a vehicle routing problem will have many constraints on 
the types of routes that can be constructed including multiple vehicle types, time 
and distance constraints, complex restrictions on what items can be in a vehicle 
together, etc. The problems that appear in the context of school bus routing and 
scheduling could be characterized as the most difficult types of vehicle routing 
problems since they have aspects of all these constraints. This is the problem we 
will consider here. 

School bus routing and scheduling is an area where, in general, computerized 
algorithms can have a large impact. User-friendly software that calls routing and
scheduling algorithms at the click of a button and produces workable solutions
can greatly affect the day-to-day operations of a dispatching unit. With
increasingly affordable high-speed computing power in desktop computers and the 
possibility of displaying geographic information on-screen, it is not surprising that 
many communities are using expert systems to perform the daunting task of rout- 
ing and scheduling their school buses. In most cases, this has led to improved 
solutions in fractions of the time that was previously required.

Unfortunately, providing workable solutions for an application such as this is
not as simple as just “clicking” the right button. Anyone who has been involved 
in a real-life optimization application knows that much discussion is involved in 
determining what the problem is and how we are to “solve” it. In this chapter we 
concern ourselves with some of the details that make it possible to put modeling 
assumptions and algorithms into action. 



18.2 The Setting 

The New York City school system is composed of 1,069 schools and approximately 
one million students. Most of these students either walk to school or are given pub- 
lic transportation passes. About 125,000 students ride school buses that are leased 
by the Board of Education. The majority, or about 83,000, of these are classified 
as General Education students. These students walk to their neighborhood bus 
stop in the morning and wait for a bus to take them to school. In the afternoon, a 
bus takes them from their school and drops them off back at their bus stop. The
rest of these students, who have particular needs and are classified as Special
Education, are picked up and dropped off directly at their homes.

This is one distinction that makes the transportation policies governing Special 
Ed students fundamentally different from those of General Ed students. Another 
fundamental difference is that, in many cases, Special Ed students enroll in schools 
with specific services and therefore may be bused over long distances. General Ed 
students usually go to schools only a few miles from their homes and almost 
exclusively to schools within the same borough. In addition, Special Ed students, 
such as wheelchair-bound students, are transported in specially designed vehicles
with much smaller carrying capacities. 

For General Ed student transportation, currently the Board of Education leases 
approximately 1,150 buses a year. Many companies bid for the contract to trans- 
port the students and currently the companies winning contracts are responsible 
for designing the routes. Independent of the company, the leasing cost to the Board 
is approximately $80,000 annually for each bus (and driver). The total yearly
budget for General Education student transportation alone is therefore roughly
1,150 × $80,000 = $92 million, or close to $100 million.

The routing of Special Education students is done differently. Using colored pins 
and large maps placed on walls, a team of inspectors/routers at the Board of Edu- 
cation Office of Pupil Transportation mark the students’ homes and schools. Then, 
using their knowledge of the geography and street conditions acquired through 
their many years of work, they literally string pins together to form routes. Al- 
though the inspectors clearly do this well, this is very time consuming. For exam- 
ple, a group of five people took approximately three months to manually generate 
routes just for the Borough of Manhattan. 

Several years ago, the New York City Board of Education appropriated funds to 
develop a computerized system, called CATS (Computer-Assisted Transportation
System). This system is supposed to help in the design of routes for both the
General and Special Ed students. The project consists of three phases. 

Phase I: Replicate the pinning and stringing approach on a computer. The purpose 
of this phase is to emulate on the computer screen what was previously done with 
maps, pins and string. First, a database is needed to keep track of all relevant 
student and school information. The student data consist of address, bus stop and 
school. For each school, the data consist of an address, as well as starting and 
ending times for all sessions. This makes data easily retrievable and updatable, 
and provides some of the basic information that is needed for routing and schedul- 
ing. In addition to the database, a method of generating maps on the computer 
is needed as well; this is the geographic information system (GIS). These systems, 
widely available only in the last few years, truly offer a new dimension to many 
decision-support systems. With this software, color-coded objects designating stu- 
dents or schools can easily be displayed on a computer screen. This enables the 
user to visualize the relative locations of important points. In addition, the user 
can “click and drag” with a mouse and get information about the area outlined. 
This information can include U.S. Census data such as number of households, me- 
dian age, income, etc. More importantly, in this application, by designating two 
points, the GIS can calculate exact locations (latitude and longitude coordinates) 
and also the distance between the two points along the street network. By “string- 
ing” together a series of points, the software can give the total distance traveled. 
When this phase is completed, inspectors currently designing Special Ed routes 
will be able to “click” on bus stops with a mouse and “string” them together on 
the computer screen. This is the method called “blocking and stringing.” 

Phase II: Extend the functionality developed in Phase I to the General Education 
stop-to-school service. The goal is to create a system whereby one could construct 
routes for the General Ed population on the computer screen. For example, by 
choosing a set of schools with a mouse, the pertinent bus stops (those with stu- 
dents going to the set of schools) are highlighted. The inspector can then string 
together the stops and schools to form a route directly on the computer screen, or 
again let the computer determine a good route through the stops. The immediate 
visualization of a possible solution (routes) along with relevant statistics (bus load, 
total travel time, total students picked up) makes it much easier to check feasibil- 
ity of the routes. This alone considerably simplifies the task of building efficient 
routes. 

Phase III: Create an optimization module. The aim here is to build software that 
uses the student and school data and the GIS to generate efficient bus routes and 
schedules meeting existing transportation policies. The software should include 
subroutines that check feasibility of suggested routes or design routes for any 
subset of the population, be it a school, a district, a borough or the entire city. 
This is the phase in which we are the most interested.

We present here a range of issues related to the development of this optimization
module (Phase III) and to the problem of routing buses through the New York City
streets. We focus on routing the General Ed students; the routing of Special Ed
students is currently being done at the Office of Pupil Transportation using the
“blocking and stringing” approach.

In Section 18.3, we give a short summary of some of the important papers that 
have appeared in the literature in the area of school bus routing and scheduling 
and related vehicle routing problems. In Section 18.4, we present details of the 
school bus routing and scheduling problem in Manhattan. In Section 18.5, we give 
a brief overview of methodologies we used to estimate distances, travel times and 
the pickup and dropoff times. 

In designing a computerized system for this problem it is important to consider 
the following questions. First, is it possible to design an algorithm that will gen- 
erate quality solutions in a reasonable amount of computing time? Second, are 
routes constructed by the computerized system truly drivable? Third, what is the
best way to make these computerized algorithms of use to the people designing 
the routes? To answer the first two questions, we designed a school bus routing 
and scheduling algorithm and ran it on the Manhattan data. The algorithm is 
presented in Section 18.6. To answer the third question, in Section 18.8 we discuss 
some of the ways in which a computerized system for school bus routing can be 
made more interactive. In Section 18.9, we present results on the Manhattan data. 



18.3 Literature Review 

In the operations research literature, we find quite a few references to the problem 
as well as many different solution techniques. A standard way the school bus 
routing and scheduling problem has been analyzed is to decompose it into two 
problems: a route generation problem where simple routes are designed (usually 
with only one school), and a route scheduling problem where these routes are 
linked to form longer routes (routes that visit more than one school). 

As early as 1969, Newton and Thomas looked at a bus routing problem for a sin- 
gle school. Using some of the first local improvement procedures for vehicle routing 
problems, they designed a tour through all the bus stops and then partitioned it 
into smaller feasible routes that each could be covered by a bus. 
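
The "tour first, partition second" idea can be sketched in a few lines. The text does not describe the partitioning rule Newton and Thomas actually used, so the greedy rule below is an assumption, as are the function name `partition_tour` and the sample data. It walks along a given tour and starts a new bus whenever the next stop would exceed the bus capacity. Only capacity is checked; time and distance limits are ignored.

```python
def partition_tour(tour, students_at_stop, bus_capacity):
    """Split an ordered tour of bus stops into consecutive routes that respect capacity."""
    routes, current, load = [], [], 0
    for stop in tour:
        demand = students_at_stop[stop]
        if demand > bus_capacity:
            raise ValueError(f"stop {stop!r} alone exceeds the bus capacity")
        # Start a new bus when adding this stop would overload the current one.
        if current and load + demand > bus_capacity:
            routes.append(current)
            current, load = [], 0
        current.append(stop)
        load += demand
    if current:
        routes.append(current)
    return routes


# Hypothetical tour through five stops with a 66-seat bus.
tour = ["A", "B", "C", "D", "E"]
students = {"A": 30, "B": 25, "C": 20, "D": 40, "E": 10}
routes = partition_tour(tour, students, bus_capacity=66)
print(routes)  # [['A', 'B'], ['C', 'D'], ['E']]
```

A greedy split of this kind is not guaranteed to use the fewest buses for a given tour; it is shown only because it is the simplest correct instance of the idea.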

In 1972, Angel et al. considered a clustering approach to generating routes. First, 
bus stops are grouped by their proximity using a clustering algorithm. Then an 
attempt is made to find minimum length routes through these clusters in such 
a way that the constraints are satisfied. Finally, some clusters are merged if this 
is feasible. The algorithm was applied to an instance consisting of approximately 
1,500 students and 5 schools in Indiana. 

In 1972, Bennett and Gazis considered the problem of generating routes. They 
modified the Savings Algorithm of Clarke and Wright (1964) (see Section 14.2). 
They also experimented with different objective functions such as minimizing total
student-miles. The problem considered had 256 bus stops and approximately 30
routes in Toms River, New Jersey. 
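
For readers who want a reminder of the underlying heuristic, here is a minimal sketch of the basic savings idea. It is not the modified version used by Bennett and Gazis, whose changes are not detailed here. The sketch assumes symmetric distances, treats node 0 as the depot (for a single-school problem, the school), enforces capacity only, and uses a hypothetical four-node distance matrix.

```python
def savings_routes(dist, demand, capacity, depot=0):
    """Basic savings heuristic: merge routes in order of decreasing saving."""
    customers = [i for i in range(len(dist)) if i != depot]
    route_of = {i: [i] for i in customers}  # each stop starts on its own route
    # Saving from serving i and j on one route instead of two out-and-back trips.
    savings = sorted(
        ((dist[depot][i] + dist[depot][j] - dist[i][j], i, j)
         for i in customers for j in customers if i < j),
        reverse=True,
    )
    for saving, i, j in savings:
        if saving <= 0:
            break
        ri, rj = route_of[i], route_of[j]
        if ri is rj or sum(demand[k] for k in ri + rj) > capacity:
            continue
        # Orient the routes so that i would end the first and j start the second.
        if ri[0] == i:
            ri = ri[::-1]
        if rj[-1] == j:
            rj = rj[::-1]
        if ri[-1] != i or rj[0] != j:
            continue  # i or j is an interior stop, so the routes cannot be joined here
        merged = ri + rj
        for k in merged:
            route_of[k] = merged
    unique = []
    for route in route_of.values():
        if not any(route is seen for seen in unique):
            unique.append(route)
    return unique


# Hypothetical data: node 0 is the depot, nodes 1 to 3 are stops with 30 students each.
dist = [
    [0, 10, 10, 10],
    [10, 0, 4, 18],
    [10, 4, 0, 16],
    [10, 18, 16, 0],
]
demand = [0, 30, 30, 30]
routes = savings_routes(dist, demand, capacity=66)
print(routes)  # [[1, 2], [3]]
```

Stops 1 and 2 are merged because joining them saves 16 distance units; stop 3 stays on its own route because adding it would put 90 students on a 66-seat bus.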

In 1979, Bodin and Berman used a 3-opt procedure to generate an initial trav- 
eling salesman tour which is then partitioned into feasible routes. This algorithm 
uses two additional components: a lookahead feature and a bus stop splitter. The 
lookahead feature allows the initial order to be changed slightly. The bus stop 
splitter allows a bus stop to be split into smaller bus stops. Two problems were 
studied. One dealt with a school district in a densely populated suburban area with 
13,000 students requiring bus transportation each day and 25 schools. A second 
district, also in a suburban area, had 4,200 students transported. 

In 1984, Swersey and Ballard addressed only the problem of scheduling a set 
of routes that had already been designed. Given a set of routes that delivered all 
students from their bus stops to their schools, the authors devised a method to find 
the minimum number of buses that could “cover” these routes. This scheduling 
problem can be formulated as a difficult integer program. The authors used some 
simple cutting planes to solve it heuristically. The size of the problem considered 
was approximately 30-38 buses and 100 routes. 

Finally, in 1986, Desrosiers et al. studied a bus routing problem in Montreal, 
Canada. Using several techniques, depending on whether the stops were in rural or 
urban areas, they generated a set of routes. To schedule them, they formulated the 
problem as an integer program and solved it using a column generation approach. 
The problem solved had 60 schools and 20,000 students. 



18.4 The Problem in New York City 

The School Bus Routing and Scheduling Problem can take many forms depending 
on how generally it is formulated. In its most general form, the problem consists of 
a set of students distributed in a region who have to be brought to and from their 
schools every school day. The problem consists of determining bus stop locations, 
assigning students to bus stops, and finally routing and scheduling the buses so 
as to minimize total operating cost while following all transportation guidelines. 
The difficulty, of course, is that these subproblems are interdependent and
therefore should be looked at simultaneously. That is, any determination of bus
stop locations, and who gets assigned to each, clearly has an impact on the routes 
and schedules of the buses. Hence, an integrated approach is required to avoid 
suboptimality. However, due to the complexity and the size of the problem this has
historically never been attempted. In addition, it is often not possible to
reoptimize all aspects of the problem, such as bus stop locations or assignments.

To understand why this problem is so complex, consider for instance the bus stop 
location problem on its own. There are numerous constraints and requirements: 
no more than a certain number of students can be assigned to the same bus stop; 
bus stops cannot be within a certain distance of each other; each student must be 
within a short walk of the bus stop and must not cross a major thoroughfare, etc.

In our case, the Board of Education decided that the bus stops that are currently
being used will remain in use. Thus, the position of the bus stops as well as 
which students are assigned to each was assumed fixed. These stops satisfy all the 
requirements mentioned above. Our routing and scheduling problem thus starts 
with a set of bus stops, each with a particular number of students assigned to it 
destined for a particular school. Each school has starting and ending times for each 
session. In addition to bus stop and school data, it is assumed that distance and 
travel time between any two points in the area are readily available. This issue 
will be discussed in more detail in Section 18.5. 

We formally define a route as follows. A route is a sequence of stops and possibly 
several schools that can be feasibly driven by one bus. For example, routes for the 
morning problem always start with a pickup at a stop and end with dropoff at a 
school. In contrast, an afternoon route always starts with a pickup at a school and 
ends with a dropoff at a stop. 

The goal is to design a set of minimum cost routes satisfying all existing trans- 
portation guidelines. The major cost component to the Board of Education is the 
cost of leasing each bus and driver, and hence the objective is essentially to mini- 
mize the number of buses needed to feasibly transport the students. Clearly, safety 
is the first consideration, and it is the view of the Board of Education that bus 
routes that meet all transportation guidelines provide a high level of safety. The 
rest is up to the drivers. 

Route feasibility is the most complex aspect of the problem. There are numerous 
side constraints. First, the bus can hold only a limited number of students at one
time (capacity constraint). Second, each student must not be on the bus for more
than a specified amount of time or distance (time or distance constraint). This
is motivated by the simple observation that the less time spent on the bus the
safer and more desirable it is for the students. Finally, there are restrictions
on the time a bus can arrive at a school in the morning, and on the time a bus
can leave the school in the afternoon (time window constraints). In many school
bus routing and scheduling problems, transportation policies specify that students 
from different schools not be put on the same bus at the same time; that is, no 
mixed loads are allowed. Clearly, allowing mixed loads provides increased flexibility 
and therefore can lead to savings in cost. In New York City, for the most part, 
mixed loads are allowed. These are the primary constraints; several others are
discussed in Section 18.7.

We will deal exclusively here with the problem of delivering the students to 
their school in the morning. Researchers have noted that this problem is usually 
more critical than the afternoon problem for two reasons. First, in the afternoon 
the time windows are usually less constraining. For example, in Manhattan (in 
the morning), school starting times fall between 7:30am and 9:00am. That gives 
roughly a one and a half hour time window to pick up all students and take them to
their schools. In the afternoon, schools end at times over a wider range: anywhere 
between 1:00pm and 4:15pm. Second, traffic congestion is usually higher in the 
morning hours than in the afternoon hours when the students are being bused. 
Therefore, it is very likely more buses will be needed in the morning than in the
afternoon. Indeed, our computational experiments reported in Section 18.9 verify
that this is true in Manhattan. Note that the solution found in the morning cannot 
be simply replicated in the afternoon, that is, having each bus travel the same route 
as in the morning but in the opposite direction. This is true since the sequencing 
of school ending times in the afternoon is different from the sequencing of school 
starting times and therefore schools visited in one order in the morning cannot 
always be visited in the same or opposite order in the afternoon. 

For the morning problem in Manhattan, the specific problem parameters are 
given below. During the 1992-93 academic year, 4,619 students were transported 
by school buses from 838 bus stops to 73 schools. The constraints were as follows. 

• Vehicle capacity constraint. At most 66 students can be on the bus at one 
time. 

• Distance constraint. Each student cannot be on the bus for more than 5 
miles. 

• Time window constraints. Buses must arrive at a school no earlier than 25
minutes before and no later than 5 minutes before the start of school. 

• The earliest pickup must not be before 7:00 a.m. 

• Mixed loads are allowed. 

The 5-mile distance constraint is not applied uniformly to all students; students 
in District 6 (upper Manhattan) are often transported out of their district due to 
overcrowding. Therefore, since this involves longer trips, sometimes traversing most 
of the island, the 5-mile constraint is not applied to these students. Approximately 
36% of the students in our application were in this group. 
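
The morning constraints listed above translate directly into a feasibility check for a single route. The sketch below is an illustration under stated assumptions, not the procedure used in the case study: the `Pickup` and `Dropoff` records, the `exempt` flag for students not subject to the 5-mile rule, and the sample route with "School A" and "School B" are all hypothetical. Times are minutes after midnight and distances are cumulative miles driven along the route.

```python
from dataclasses import dataclass

CAPACITY = 66             # students on the bus at one time
MAX_RIDE_MILES = 5.0      # per-student ride distance limit
EARLIEST_PICKUP = 7 * 60  # 7:00 a.m., in minutes after midnight
WINDOW = (25, 5)          # arrive between 25 and 5 minutes before school starts


@dataclass
class Pickup:
    time: int             # minutes after midnight
    odometer: float       # miles driven since the start of the route
    school: str           # destination school of the students at this stop
    students: int
    exempt: bool = False  # True if the 5-mile rule does not apply


@dataclass
class Dropoff:
    time: int
    odometer: float
    school: str
    start: int            # school starting time, minutes after midnight


def is_feasible(route):
    """Check capacity, ride distance, time windows and earliest pickup for one route."""
    on_board = []  # pickups whose students are still on the bus (mixed loads allowed)
    for event in route:
        if isinstance(event, Pickup):
            if event.time < EARLIEST_PICKUP:
                return False
            on_board.append(event)
            if sum(p.students for p in on_board) > CAPACITY:
                return False
        else:
            earliest, latest = event.start - WINDOW[0], event.start - WINDOW[1]
            if not earliest <= event.time <= latest:
                return False
            for p in on_board:
                rode = event.odometer - p.odometer
                if p.school == event.school and not p.exempt and rode > MAX_RIDE_MILES:
                    return False
            on_board = [p for p in on_board if p.school != event.school]
    return not on_board  # every student must have been delivered


route = [
    Pickup(time=7 * 60 + 30, odometer=0.0, school="School A", students=40),
    Pickup(time=7 * 60 + 40, odometer=1.5, school="School B", students=20),
    Dropoff(time=8 * 60, odometer=3.0, school="School A", start=8 * 60 + 20),
    Dropoff(time=8 * 60 + 20, odometer=5.5, school="School B", start=8 * 60 + 30),
]
print(is_feasible(route))  # True
```

Raising the second pickup to 30 students would put 70 students on the bus at once, and the same check would then reject the route.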

The Manhattan school bus routing problem presents many challenges. First of 
all, the number of bus stops and schools is much larger than those encountered in 
most vehicle routing applications. Second, there are many difficulties involved in 
calculating accurate distances and travel times in New York City. We now consider 
these two points. 



18.5 Distance and Time Estimation 

To accurately estimate distances one needs a precise geographic representation of 
the area. This is achieved using a geographic information system (GIS) which is 
based on data files built from satellite photographs. These files store geographic 
objects, such as streets, highways, parks and rivers that can be presented on a 
computer screen. An important feature is the ability to calculate exact latitudes 
and longitudes of any point. Given a street address, the process of geocoding returns 
the coordinates of the address with very high accuracy. Given these coordinates, it 
is then easy to calculate “as the crow flies” or “Euclidean” distances. Some GISs 
also have the capability of calculating exact road network distances, that is, the 
distance between two points on the actual street network, sometimes even taking 
into account one-way streets. 
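
As an illustration of the “as the crow flies” calculation, the sketch below computes the great-circle distance between two geocoded points using the haversine formula. This is our own example and not necessarily what any particular GIS does; the function name and the Earth radius value (a commonly used mean radius of about 3,958.8 miles) are assumptions.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius


def crow_flies_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (latitude, longitude) points.

    Coordinates are in decimal degrees.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))
```

Over the few miles spanned by a single borough, this value is very close to a planar Euclidean distance; it ignores the street network and one-way restrictions entirely, which is why the network distances discussed next are preferred.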

The Office of Pupil Transportation at the Board of Education uses a GIS called 
MapInfo for Windows. The MapInfo version used by the City does not have a 
street network representation of New York City. However, such a network has been 
developed by a subcontractor and therefore accurate shortest distances between 
any two points along the street network are readily available. The current version 
also takes into account one-way streets. Although incorporating one-way street 
information may seem like a trivial task, it turned out to be very difficult. We 
believe most current geographic information systems are highly inaccurate with 
regard to one-way streets and are probably unusable without substantial error 
checking. The New York City Department of Transportation does not keep the 
information in an easily retrievable format. We had to resort to checking the one- 
way street sign database at the NYC DOT to reconstruct accurate information 
about one-way streets. Needless to say, the data collection and error checking were 
extremely time consuming. 

Estimating accurate travel times in New York City is probably the trickiest 
part of the problem. As described above, a GIS with a street network represen- 
tation simplifies the calculation of street distances. In addition, in the GIS each 
data structure corresponding to a street segment has space to store the average 
travel speed and/or travel time along the segment. These estimates would make 
it possible to calculate travel times along any path. The difficulty lies, of course, 
in determining these travel speeds. 

Most existing vehicle routing implementations that we are aware of use a fixed 
travel speed throughout the area of interest. Travel times are then determined by 
simply dividing the distance traveled by this universal speed. This method is most 
likely not satisfactory for New York City. Anyone who has driven in New York 
City knows the multitude of different street types and congestion levels that can 
produce a wide variety of different travel speeds. We decided to try to get some 
idea of the average speed in different parts of New York City. 

In addition to performing various timing experiments, we obtained several re- 
ports from the New York City Department of Transportation. These include “Mid- 
town Auto Speeds-Spring 1992” and “Midtown Auto Speeds-Fall 1992.” These re- 
ports provide data on Midtown Manhattan average travel speeds as well as some 
data on the variance of these speeds. (Midtown Manhattan is defined as the rectangular 
area between First and Eighth Avenues and 30th and 60th Streets.) The 
data seem to suggest that speeds vary from an average of 6 miles per hour up to 
about 14 miles per hour, depending on street type, direction and time of day. 

Our approach was to choose an estimate of speed that would be specific to each 
district; thus, a district in the Bronx would not have the same speed estimate as 
one in Midtown Manhattan. These range from about 7 miles per hour to 12 miles 
per hour. An important observation made when collecting data was that when a 
bus experiences below average travel times along the beginning of the route, the 
bus driver slows down or spends more time at the stops to get back on schedule. 
In addition, since the students have a scheduled pickup time, the bus cannot, as a 
rule, leave early. It must wait until a specific time before leaving the bus stop. If 
the bus experiences above average travel times (below average speeds), then the 
bus driver can speed up (slightly) and make sure to leave as soon as all students 
are on the bus. Consequently, the travel time is not quite as random as one might 
think. 
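
The conversion from distance to travel time under a district-specific speed is simple arithmetic. The sketch below is a minimal illustration; the function name and the district labels are hypothetical, and the two speeds are just the ends of the range quoted above.

```python
def travel_time_minutes(distance_miles, speed_mph):
    """Travel time in minutes for a given distance at a constant speed."""
    if speed_mph <= 0:
        raise ValueError("speed must be positive")
    return 60.0 * distance_miles / speed_mph


# Hypothetical per-district speed estimates (miles per hour).
district_speed_mph = {"district_a": 7.0, "district_b": 12.0}

# The same 2-mile leg takes noticeably longer in the slower district.
slow = travel_time_minutes(2.0, district_speed_mph["district_a"])
fast = travel_time_minutes(2.0, district_speed_mph["district_b"])
```

Here `fast` is 10 minutes and `slow` is a little over 17 minutes, a difference large enough to decide whether a 20-minute arrival window is met.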

To make sure that school buses meet the time window constraints, information 
about travel time along the streets of New York City is not sufficient. The time 
to pick up students from their bus stops and to drop off students at their schools 
must also be taken into account. By riding the buses, we collected data on the time 
it takes to pick up or drop off students at stops or at schools. A linear regression 
was performed on the data providing the following model for the pickup time: 

    PTime = 19.0 + 2.6N, 

where PTime = pickup time (in seconds), and N = number of students picked 
up at the bus stop. This regression was performed on 30 data points. The R^2 was 
77.7% and the p-value of the independent variable was very small (< 0.001). The 
regression performed on the dropoff times resulted in the equation: 

    DTime = 29.0 + 1.9N, 

where DTime = dropoff time (in seconds), and N = number of students dropped 
off at the school. This regression was performed on 30 data points. Here the R^2 
was 41.9% and the p-value of the independent variable was 0.01%. In our 
implementation, we used these equations to determine approximate pickup and dropoff 
times. 
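
The two fitted equations translate directly into code. The following sketch uses function names of our own choosing; the coefficients are the ones reported above.

```python
def pickup_seconds(n_students):
    """Estimated time (seconds) to pick up n_students at a bus stop."""
    return 19.0 + 2.6 * n_students


def dropoff_seconds(n_students):
    """Estimated time (seconds) to drop off n_students at a school."""
    return 29.0 + 1.9 * n_students
```

For example, picking up 5 students is estimated at about 32 seconds, and dropping off 10 students at about 48 seconds.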

Overall, the approximations and calculations made in testing the optimization 
module were designed with the goal of ensuring that a route constructed by the 
algorithm would be a driveable one. The next question is how to generate a good 
feasible solution to the school bus routing and scheduling problem. 



18.6 The Routing Algorithm 

There are many existing algorithms for school bus routing and scheduling. Numer- 
ous communities throughout the world have implemented computerized algorithms 
to perform these tasks. Overall, the success seems to be universally recognized. Al- 
most all papers published in the literature mention cost savings of around 5-10%. 
We recognize that it may be useless to even contemplate the meaning of these sav- 
ings numbers since the savings may not only come from reduction in cost but also 
from increased control of the bus routes. The magnitude of the “savings” is also 
highly dependent on what methods were in use before the computerized system 
was put into place. 

Transferability seems to be the critical factor. It is difficult to compare algorithms 
for this problem directly from the literature. Each problem has its own 
version of the constraints and even objectives. It is not always simple or even possible 
to take an existing algorithm in use in one community and simply apply it 
to another. Each problem has its peculiarities and may also have very different 
constraints. For instance, in an implementation in Montreal, the people designing 
the routes have the freedom to change existing school starting and ending times at 
their convenience. Clearly this added flexibility can simplify the problem to some 
extent, and can lead to additional savings in cost. In New York City this was not 
possible. 

Finally, this is all within the framework of an optimization problem, which we 
have seen is extremely difficult to solve. There is an absence of any strong lower 
bounds on the minimum number of buses required. 

In determining what type of algorithm to apply to this large vehicle routing 
problem, we considered several important aspects of the problem and also the 
setting in which the algorithm would be used. 

Efficiency. This is an extremely large problem, so the solution method must be 
efficient in computation time and in space requirement. Assuming optimiza- 
tion might be done by district, some districts have as many as 1,500 bus 
stops. Even though complete optimization of the solution might only be 
done once a year, the time involved in testing and experimenting with the 
problem parameters is reduced considerably if the algorithm is time and 
space efficient. 

Transparency. The algorithm would most likely need to be constructive in nature, 
giving a dispatcher the ability to view the progress of the algorithm 
in real time. This makes it possible to detect “problem routes” 
and correct errors without having to wait until the termination of the algorithm. 
That is, the approach should build routes in a sequential fashion and 
not, for example, work for hours and only in the last moments provide a 
solution. 

Flexibility. The heuristic should be flexible enough to handle not only the constraints 
currently in place, but also additional constraints that might be imposed 
in the future. 

Interactivity. From our discussions with the inspectors it is clear that the algorithm 
implemented must have an interactive component that would allow an 
experienced inspector to help construct routes using his or her prior knowl- 
edge. That is, the algorithm must be able to work in two different modes. 
First, it must be able to act like a black box, where data are input and a 
solution is output. Second, it must also serve as an interactive tool, where 
a starting solution can be presented along with a set of unrouted stops and 
the algorithm finds the best way to add on to this starting solution. 

Multiple Solutions. The algorithm should be capable of producing a series of 
solutions, not simply one solution. This last point is important since each 
solution would have to be checked by an inspector, and it is possible that 
the inspector will rule out some solutions. 

Finally, the urban nature of our application, in contrast to many of the problems 
seen in the literature, should also be taken into account. As many researchers have 
noted (see Bodin and Berman (1979) and Chapleau et al. (1983)), the vehicle 
capacity constraint tends to be the most binding constraint when routing in an 
urban area. This is due to the general rule that the bus will tend to “fill up” before 
the time constraints become an issue. Therefore, it seems as though algorithms 
developed for the Capacitated Vehicle Routing Problem (CVRP) (see Chapter 
14) should be a good starting point. The difficulty is that the CVRP generally 
has a different objective function: minimize the total distance traveled, not the 
number of vehicles used. Fortunately (see Chapter 14 or Bramel et al. (1991)), if 
the number of pickup points is very large and distances follow a general norm, 
when the distance is minimized, a byproduct of the solution is that the minimum 
number of vehicles will be used. Observe that distances in New York City come 
from the street network, not from a norm; however, since the blocks are short and 
somewhat uniform in size, the street network distance is fairly close to a norm 
distance, and similar results most likely hold. 

For these reasons, our starting point for the algorithm for the school bus routing 
and scheduling problem was the Location Based Heuristic (LBH) (see Section 14.7) 
developed for the CVRP. This algorithm has the important property that it is 
asymptotically optimal for the CVRP (see Section 14.7); that is, the relative error 
between the value of the solution generated by the algorithm and the optimal 
solution value tends to zero as the number of pickup points increases. 

Due to the size and complexity of the problem, we made several changes to the 
LBH. The algorithm is serial in nature: it constructs one route at a time rather 
than several in parallel. To describe the algorithm, let the bus stops be indexed 
1, 2, ..., n. Let a route run by a single bus be denoted R_i. Let a full solution to 
the school bus routing and scheduling problem be written as a set of routes 
{R_1, R_2, ..., R_M}, where M is the number of buses used. For each bus stop j, let 
school[j] be the index of the school to which the students at stop j are destined. 
Let U be the set of indices of all unvisited pickup points. 

The following algorithm creates one solution to the school bus routing and 
scheduling problem. More solutions can be generated by starting the algorithm 
with different random seeds. 

Randomized LBH: 

    Let U = {1, 2, ..., n} and m = 0. 
    while (U ≠ ∅) do 
    { 
        Let m ← m + 1. 
        Pick a seed stop from U using a selection criterion. Call it j. 
        Let U ← U \ {j}. 
        Let the current route be R_m = {j → school[j]}. 
        repeat 
        { 
            For each i ∈ U, calculate c_i = routelength(i, R_m). 
            Let c_k = min_{i ∈ U} c_i (with c_k = +∞ if U is empty). 
            If c_k < +∞ then 
            { 
                Let R_m ← buildroute(k, R_m). 
                Let U ← U \ {k}. 
            } 
        } until c_k = +∞. 
    } 
    M ← m. 

The heuristic solution is {R_1, R_2, ..., R_M}. 
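
The control flow of the procedure above can be sketched in Python as follows. This is a minimal illustration under our own assumptions, not the implementation used in the study: the seed stop is chosen uniformly at random, and the functions `new_route`, `routelength` and `buildroute` are supplied by the caller. The toy instance at the end (stops on a line, one school, at most two stops per route) is entirely hypothetical and uses a deliberately simplistic insertion rule.

```python
import math
import random


def randomized_lbh(stops, new_route, routelength, buildroute, seed=0):
    """Build routes one at a time until every stop is routed."""
    rng = random.Random(seed)
    unvisited = set(stops)
    routes = []
    while unvisited:
        j = rng.choice(sorted(unvisited))      # random seed-stop selection
        unvisited.remove(j)
        route = new_route(j)                   # j -> school[j]
        while unvisited:
            costs = {i: routelength(i, route) for i in sorted(unvisited)}
            k = min(costs, key=costs.get)      # cheapest stop to insert
            if math.isinf(costs[k]):
                break                          # nothing fits: close the route
            route = buildroute(k, route)
            unvisited.remove(k)
        routes.append(route)
    return routes


# Toy instance (hypothetical): five stops on a line and a single school.
position = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0, 5: 5.0}


def toy_new_route(j):
    return [j, "school"]


def toy_routelength(i, route):
    if len(route) - 1 >= 2:                    # route already has two stops
        return math.inf
    return abs(position[i] - position[route[0]])


def toy_buildroute(k, route):
    return [k] + route


routes = randomized_lbh(position, toy_new_route, toy_routelength,
                        toy_buildroute, seed=1)
```

With five stops and at most two stops per route, the toy run always produces three routes; changing `seed` changes which stops are grouped together, which is how multiple solutions are generated.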

The selection of the seed stops can be done in one of several different ways. 
One approach is to simply select these stops at random from the set of unvisited 
stops. Another approach is to select stops with large loads or stops that have 
tight delivery windows (i.e., the distance and time constraints force these stops 
to be delivered almost directly from the stop to the school with very few stops in 
between). Other criteria were used according to which constraints were binding at 
particular stops. 

The function routelength(i, R) determines the approximate cost of inserting stop 
i into route R. Route R consists of a path through several stops and schools. While 
preserving the order of the stops and schools in route R, we determine the best 
insertion point for stop i. We check each consecutive pair of points (either stops 
or schools) along route R and check whether stop i can be inserted between these 
two. If school[i] is not in route R, then we must not only find the best insertion 
point for stop i, but also the best insertion point for school[i]. It is possible that no 
insertion point(s) can be found that results in a feasible route. Checking whether 
a stop can be inserted requires checking all the constraints. If no feasible insertion 
point exists, then the value of routelength(i, R) is made +∞. This indicates that it 
is not possible (while preserving the order of R) to insert stop i into route R. If an 
insertion is found that results in a feasible route, then the value of routelength(i, R) 
is made to be exactly the additional distance traveled. 
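
An order-preserving insertion search of this kind might look as follows. This is a sketch under stated assumptions rather than the routine used in the study: the name `cheapest_insertion` is ours, the route is a plain list of point identifiers in which each school appears at most once, `dist` and `feasible` are caller-supplied functions, and insertion at the very front or end of the route is also allowed. For convenience it returns both the added distance and the resulting route.

```python
import math


def cheapest_insertion(i, school_i, route, dist, feasible):
    """Cheapest feasible insertion of stop i (and its school, if absent).

    Returns (added_distance, new_route), or (math.inf, None) when no
    feasible insertion exists. The order of existing points is preserved.
    """
    def length(r):
        return sum(dist(a, b) for a, b in zip(r, r[1:]))

    base = length(route)
    best_cost, best_route = math.inf, None
    for p in range(len(route) + 1):            # position of stop i
        with_stop = route[:p] + [i] + route[p:]
        if school_i in route:
            # The school is already visited; the stop must precede it.
            ok = p <= route.index(school_i)
            candidates = [with_stop] if ok else []
        else:
            # Insert the school as well, somewhere after stop i.
            candidates = [with_stop[:q] + [school_i] + with_stop[q:]
                          for q in range(p + 1, len(with_stop) + 1)]
        for cand in candidates:
            cost = length(cand) - base
            if cost < best_cost and feasible(cand):
                best_cost, best_route = cost, cand
    return best_cost, best_route
```

All constraint checks are delegated to `feasible`, so adding a new constraint only changes that one function; the price is that every candidate route is checked from scratch.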

To illustrate the difficulty of this step, consider simply the capacity constraint. 
In the case of the CVRP, all loads are dropped off at the same point (the final 
stop); therefore, the load carried by the vehicle reaches its maximum when it picks 
up its last load. It is thus easy to check whether a stop can be added to a 
route: we need only check that this maximum load, which always occurs at the last 
pickup, does not exceed the vehicle capacity. By contrast, the school bus routing and 
scheduling problem has more than one dropoff point. Checking feasibility when 
adding a stop to a route requires knowing when each student gets on and off the 
bus, since this affects whether there is room for a student at future points on the 
bus route. Checking whether the capacity constraint is violated is therefore much 
more complicated than in the CVRP. 
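
To make the contrast concrete, the sketch below tracks the number of students on board along a route with several dropoff points. The function names and the data layout (dictionaries keyed by stop) are our own assumptions.

```python
def peak_load(route, load, school_of):
    """Largest number of students on board at any point of the route.

    route     : ordered list of points (stops and schools)
    load      : dict stop -> number of students picked up there
    school_of : dict stop -> school where those students get off
    """
    on_board = 0
    peak = 0
    bound_for = {}  # school -> students currently on board going there
    for point in route:
        if point in load:                      # a pickup stop
            on_board += load[point]
            s = school_of[point]
            bound_for[s] = bound_for.get(s, 0) + load[point]
        else:                                  # a school: its students leave
            on_board -= bound_for.pop(point, 0)
        peak = max(peak, on_board)
    return peak


def capacity_ok(route, load, school_of, capacity=66):
    """True if the bus never carries more than `capacity` students."""
    return peak_load(route, load, school_of) <= capacity
```

A route that picks up 90 students in total can still be feasible for a 66-seat bus if enough of them get off at an intermediate school, which is exactly what a single check at the last pickup would miss.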

The function buildroute(k, R) creates the route that results from the insertion of 
stop k into route R. Again, stop k is simply inserted between the two consecutive 
points (stops or schools) that result in the shortest total route. This route is 
guaranteed to be feasible since c_k < +∞. 

The algorithm satisfies the requirements that we described above. It runs effi- 
ciently for problems of large size and builds routes sequentially. It is very flexible 
in the sense that constraints of almost any type can be included (e.g., disallowing 
mixed loads for some schools). Of course each additional constraint causes the 
algorithm to take a little longer to find a solution. In terms of its interactivity (see 
Section 18.8 for details), the algorithm can be used in an interactive mode if 
this is desired. In this mode, a partial routing solution can be used as a starting 
point and unrouted stops can be added efficiently. The inspector can also have a 
major impact on the routes generated by the algorithm via the selection of the 
seed points (see Section 18.8 below for a further discussion on this point). Since 
the algorithm can be easily randomized (by randomizing the seed stop selection 
procedure), starting the algorithm with different random numbers makes it gen- 
erate different solutions. Finally, the most important advantage of this heuristic is 
that it does not decompose the problem into subproblems, but solves the routing 
and scheduling components simultaneously. 



18.7 Additional Constraints and Features 

In the course of the implementation of our algorithm, several additional “soft” 
constraints came to our attention. These are subtle rules that inspectors used when 
constructing feasible routes, which we only learned of once a set of routes was 
shown to the inspectors. 

Limit on the number of buses to a particular school. This is best explained 
with an example. Consider the situation where a school, say school A, has 
a late starting time relative to other schools, say 9:30am, where all other 
schools start at 9am, and assume only a dozen of the students from school 
A require bus service. Previously, if a solution required 20 buses to serve 
all schools, routers would take one of these and have it alone serve school 
A. That is, some time between 9am and 9:30am one bus would pick up the 
dozen students and deliver them to school A. Since 20 buses are used in the 
solution, this solution is equivalent to, for example, having 6 of the 20 buses 
each deliver 2 students to school A between 9am and 9:30am. This, from a 
cost point of view, is just as good a solution. However, school A may only 
be able to handle one or two buses at a time due to limited driveway space. 
We therefore needed to add a constraint on the number of buses that could 
deliver students to each school. This constraint only became active for a few 
schools. 

Multi-level relational distance constraints. When delivering packages to warehouses 
or to customers, a distance constraint is usually set on the complete 
route, reflecting the length of the driver’s working day. When delivering students to 
schools, the distance constraint is really student specific. That is, each student’s 
trip is limited, not just the driver’s. In the school bus routing and 
scheduling problem, the distance constraint also illustrates the difficulty of 
modeling, through simple constraints, a real-life problem. To illustrate this, 
consider the 5-mile distance constraint discussed earlier. We found that this 
simple constraint was actually unsatisfactory for this problem. For example, 
if a student was only 1 mile from school, then it was not considered desirable 
to have him or her end up traveling 5 miles on the bus. This student (and 
maybe more vociferously his or her parents) would not consider this an equitable 
solution. We therefore decided to implement what we call a relational 
distance constraint. That is, for a multiplier α, say α = 2, a student could 
not travel on the bus for more than α times the distance the student’s bus 
stop was from school. The question was then what value to give α. We determined 
that the best rule was to divide the region around a particular school 
into concentric rings. For example, if the first ring was 3 miles in radius, then 
a stop that was d < 3 miles from the school would have a distance constraint 
(on the bus) of α_1 d miles. Ring i was assigned a multiplier α_i, and this was 
repeated for each ring. Although it took some time to determine appropriate 
multipliers, eventually this is the type of distance constraint that was 
implemented. 

Waiting time constraint. Another constraint that did not come to our attention 
until we presented our routes to the inspectors was the waiting time 
constraint. Again, this is something that is specific to the routing of people 
as opposed to packages. Consider a simple problem with two schools, school 
A starting at 8am and school B starting at 9am. At 7:30 a bus picks up 
students for both schools A and B, arrives at school A within the time window 
(say at 7:45) and drops off only those students who are going to school A. 
Since school B starts at 9am, the bus waits for half an hour at school A before 
proceeding to pick up some more students for school B and then arriving 
at school B at 8:45 and dropping off all the students. A route of this type, 
where students wait on the bus for half an hour, was definitely not deemed 
acceptable. Therefore, we needed to add a constraint on the amount of time 
a bus (with students on it) can wait idle. Five minutes was the number that 
was eventually used. 

Route balancing. It is desirable that the routes in a solution be of similar duration 
and total distance. It does not seem fair if one driver serves morning 
routes from 7am to 7:30am while another works from 7am to 9:30am. The 
balancing of the workloads is partially achieved by implementation of a 
route-balance() subroutine that is called once, at the end of the algorithm. This 
subroutine essentially moves stops and schools from heavily loaded routes to 
less heavily loaded routes while maintaining feasibility of the solution. This 
seemed to work very well. 

Single route optimization. Once a solution is determined, we may (and should) 
optimize the sequencing of the stops and schools on each route individually. 
That is, given a set of stops and schools that can be feasibly served by one 
bus, in terms of service level, what is the “best” route to actually drive? 
An objective that guarantees a high service level is to minimize the total 
number of student-miles traveled (see e.g., Bennett and Gazis (1972)). For 
each route created, we call a procedure called route-opt() which minimizes 
the total number of student-miles while maintaining feasibility of the route. 
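
Three of the rules in this section lend themselves to short illustrations: the ring-based relational distance limit, the limit on idle waiting with students on board, and the student-miles objective. The Python sketch below is ours; all function names, the ring radii and the multiplier values are hypothetical, and stops beyond the last ring are assumed to reuse the last multiplier.

```python
import bisect


def max_ride_miles(d, ring_radii, multipliers):
    """Ride limit for a stop d miles from its school.

    ring_radii  : increasing outer radii of the concentric rings
    multipliers : one multiplier per ring; stops beyond the last radius
                  reuse the last multiplier (an assumption of this sketch)
    """
    ring = min(bisect.bisect_right(ring_radii, d), len(multipliers) - 1)
    return multipliers[ring] * d


def idle_waits_ok(waits, max_idle_min=5.0):
    """waits: (ready_min, depart_min, students_on_board) for each point.

    ready_min is when the bus could leave (arrival plus service time);
    only waits with students on board are limited.
    """
    return all(depart - ready <= max_idle_min
               for ready, depart, on_board in waits if on_board > 0)


def student_miles(route, load, school_of, dist):
    """Total student-miles: each stop's load times its ride to school."""
    total = 0.0
    for idx, point in enumerate(route):
        if point not in load:
            continue                      # schools pick up no students
        ride = 0.0
        for a, b in zip(route[idx:], route[idx + 1:]):
            ride += dist(a, b)
            if b == school_of[point]:
                break
        else:
            raise ValueError(f"school of stop {point} is not visited later")
        total += load[point] * ride
    return total
```

With rings at 3 and 6 miles and illustrative multipliers 2.0 and 1.5, a stop 1 mile from school gets a 2-mile ride limit and a stop 4 miles away gets a 6-mile limit. The half-hour wait in the two-school example corresponds to `idle_waits_ok([(465, 495, 10)])`, which is false.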



18.8 The Interactive Mode 

As we mentioned earlier, the complete rescheduling of all buses might only be done 
once a year (in August). However, throughout the course of the school year there 
are quite a few small changes that must be made to the solution. These changes 
could be caused by, for example: 

• A school, which previously did not request bus service, requests service in 
mid-year. 

• A student changes address or school. 

• A school’s session time changes. 

One option might be simply to reoptimize all routes that are affected by the 
changes. This might cause major disruptions in a large number of routes. These 
disruptions may translate to disruptions in the parents’ morning schedules which 
might overload the Office of Pupil Transportation telephone switchboard. In essence, 
it is desirable to implement the changes while making the fewest disruptions to 
other students’ schedules. 

This was the impetus for the development of the algorithm’s interactive mode. 
Here it is possible to start the algorithm with a number of routes already created 
and to simply add stops to or delete stops from these routes. Let’s consider what 
happens when a stop is added to an existing set of routes. The user has the ability 
to select from one of three options: 

• Complete reoptimization. This corresponds to starting the reoptimization 
from scratch, that is, throwing away all previously created routes. Optimiza- 
tion then starts with all new stops added to the list of stops. 

• Single route reoptimization. This corresponds to selecting a route and check- 
ing whether a particular stop can be added to it. This is done through a 
simple route-check() subroutine. In this case, the route may be completely 
resequenced. 

• No reoptimization. In this case, the stop is simply inserted between two stops 
on existing routes without any reoptimization. 

Deleting a stop is somewhat easier to do: the user simply clicks the mouse on 
the stop in question and deletes it from the current solution. The fact that this 
may actually render the remaining route infeasible is a good illustration of the 
complexity of the bus routing and scheduling problem. This is due to the waiting 
time constraint mentioned in the previous section. Whether adding or deleting a 
stop, the user can specify whether a reoptimization of the route is desired. 

These optimization tools proved quite useful as they provided simple ways to 
test what-if scenarios, tests that previously would have taken weeks if not months. 



18.9 Data, Implementation and Results 

To assess the effectiveness of our algorithm, we attempted to solve the problem 
using the Manhattan data given to us by the Office of Pupil Transportation, that 
is, to use our algorithm to generate a solution and to check it for actual drivability. 

We solved both the morning and the afternoon problem. We first calculated 
the shortest distance matrix between all 911 points of interest (838 stops and 73 
schools) along the street network. In our implementation, we used a speed of 8 miles 
per hour for the entire borough. This was the lowest average speed in Midtown 
Manhattan along a street or avenue between 7am and 10am (the time interval 
that the bus would be traveling in the morning) reported by the Department of 
Transportation. We feel that this average speed is quite conservative and a bus can 
on average travel more quickly. One reason for this is that the measurement was 
made in Midtown Manhattan, a location with very high congestion throughout 
the day. 

The algorithm was run on a PC (486DX2/50 megahertz) under Windows over a 
period of several hours. To generate its first feasible solution, the algorithm takes 
about 40 minutes. We repeated the algorithm 40 times keeping track of the best 
solution. The algorithm has as output a detailed schedule and directions for each 
bus. 
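
Repeating a randomized construction and keeping the best result can be sketched as follows. The function name and the convention that a solution is a list of routes (one per bus) are our assumptions; `construct` stands for any randomized route-building procedure that accepts a seed.

```python
def best_of_runs(construct, n_runs=40):
    """Run construct(seed) for seeds 0..n_runs-1; keep the fewest-bus solution."""
    best = None
    for seed in range(n_runs):
        solution = construct(seed)       # a list of routes, one per bus
        if best is None or len(solution) < len(best):
            best = solution
    return best
```

Ties are resolved in favor of the earliest run; one could instead keep every distinct solution so that inspectors have alternatives to choose from.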

In order to determine the sensitivity of the results to some of the assumptions we 
have made, we ran the algorithm with several settings for the average travel speed. 
We used 8 mph, 10 mph and 12 mph. Note again these speeds are conservative, as 
we have also taken into account the time to stop and pick up or drop off students. 
The following table lists the number of buses used in the best solutions found for 
each of these settings and for the morning and afternoon problems. 



Table 1: General education routing 

Universal     Number of Buses Used 
Speed         Morning     Afternoon 
------------------------------------ 
8 mph         74          67 
10 mph        64          60 
12 mph        59          56 
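
The sensitivity to the assumed speed can be summarized as a percentage reduction in buses relative to the 8 mph setting. The short calculation below uses only the figures in Table 1; the variable and function names are ours.

```python
buses = {  # speed (mph) -> (morning, afternoon), from Table 1
    8: (74, 67),
    10: (64, 60),
    12: (59, 56),
}


def reduction_pct(base, value):
    """Percentage reduction of value relative to base, to one decimal."""
    return round(100.0 * (base - value) / base, 1)


base_am, base_pm = buses[8]
reductions = {speed: (reduction_pct(base_am, am), reduction_pct(base_pm, pm))
              for speed, (am, pm) in buses.items()}
```

Raising the assumed speed from 8 to 10 mph reduces the morning fleet by about 13.5% and the afternoon fleet by about 10.4%; going to 12 mph gives about 20.3% and 16.4%, respectively.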



As a comparison, these solutions use substantially fewer buses than are currently 
in use. We do not expect that the number of buses used will be as low as indicated 
by our preliminary results, due to the fact that the routes have not been checked 
by the inspectors. However, it is reasonable to assume that they will serve as a 
starting solution which can be modified by the inspectors. 